Optimizing NumPy for speed and efficiency

NumPy is the workhorse of data analysis. Even if you don't use it directly, the libraries that you are using, whether pandas or scikit-learn or whatever, are using NumPy for their work. So you have to be cognizant of NumPy, and even if you don't use it directly, you have to be careful.

I'm going to change the order a little bit and go to the last point here. Imagine that you don't use NumPy directly: you still have to worry about NumPy's architecture, because NumPy uses a set of lower-level libraries provided by other systems. These are called BLAS and LAPACK (the name of the second one, the one that goes with BLAS behind linalg, just slipped my mind). These libraries are implemented by somebody else. For instance, the stock library in most Linux distributions is not multi-threaded, but you have many alternatives, OpenBLAS for example, or Intel's MKL. When you have your NumPy installation, you have to check which BLAS libraries are linked, because if they are not multi-threaded they will not make use of all your cores, and if for some reason you are doing massive matrix processing you really lose a lot there.

So my suggestion: even if you don't program a single line in NumPy, you have to be careful with what you are linking. In many Debian and Ubuntu distributions, I think the default linked library is actually non-threaded. If you use conda or Anaconda, they will link OpenBLAS, and that should be enough. You can also link MKL, and there are other alternatives, but I think if you have OpenBLAS or MKL you should be fine. Even if you don't write a line of NumPy, you should look at this and be concerned with it.
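
As a rough illustration of the check described above, the sketch below captures what `np.show_config()` prints and searches it for a few well-known BLAS names. The helper name `blas_config_report` and the list of names to look for are assumptions made for this example, not something from the talk, and the exact text that `show_config()` prints differs between NumPy versions.

```python
import contextlib
import io

import numpy as np


def blas_config_report() -> str:
    """Return the build/link information that np.show_config() prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


report = blas_config_report()
print(report)

# Illustrative list only: names of BLAS implementations that may show up
# in the report. An empty result does not prove anything by itself.
known_names = ("openblas", "mkl", "blis", "accelerate", "atlas")
found = [name for name in known_names if name in report.lower()]
print("BLAS-like names mentioned:", found if found else "none recognised")
```

If the report mentions OpenBLAS or MKL, the number of threads they use can usually be limited with environment variables such as `OPENBLAS_NUM_THREADS`, `MKL_NUM_THREADS` or `OMP_NUM_THREADS`, set before NumPy is imported.
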
Furthermore, NumPy is used by everybody, or almost everybody, in the data analytics space, and it uses ndarrays, n-dimensional arrays, which are essentially arrays: a sequential allocation in memory of data of the same type. In general terms the type can even be a Python object, but there is not much point in that. Most of the types that we are interested in are booleans, integers or floats, and we want to be careful about whether they are signed or unsigned.

So we have this blob of memory, this piece of memory that is mostly managed by NumPy, where you have all the data in a CPU-native format. Let's call it that; it's not really, but let's call it that. And then we have a certain set of metadata, and the metadata is something you can actually tweak. For instance, if you have an array of four by five, a two-dimensional matrix, you can tweak the metadata to make it a 1D array of 20 elements, or five by four, or transpose it, without any copying cost. You can also invert the direction without inverting the representation in memory: you just tell NumPy to interpret the data with a different stride. So there are lots of optimizations that can be done there. Imagine that you have an array and you want to transpose it: you don't actually need to copy all the raw data of the old ndarray.
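
The metadata tweaks mentioned above can be sketched as follows. This is a minimal example, assuming a 4 by 5 array of 64-bit integers (the dtype is fixed only so that the strides are the same on every platform); the variable names are made up for the illustration.

```python
import numpy as np

# A 4x5 matrix stored as one contiguous block of 20 int64 values.
a = np.arange(20, dtype=np.int64).reshape(4, 5)

flat = a.reshape(20)             # same data seen as a 1D array of 20
five_by_four = a.reshape(5, 4)   # same data seen as 5x5... no: as 5 rows of 4
transposed = a.T                 # rows and columns swapped
reversed_rows = a[::-1]          # rows read in the opposite direction

views = {
    "original": a,
    "flat": flat,
    "five_by_four": five_by_four,
    "transposed": transposed,
    "reversed_rows": reversed_rows,
}

for name, view in views.items():
    # Only shape and strides differ; the underlying memory is shared.
    print(f"{name:14} shape={view.shape} strides={view.strides} "
          f"shares_memory={np.shares_memory(a, view)}")
```

Each line of output shows a different shape and stride for the same block of memory; the negative stride of `reversed_rows` is how NumPy reads the rows backwards without moving any data.
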
You just create a new view, and you share the same data. Of course, when you do that you have to be careful, because you end up with a single block of raw data and many views. You are saving a lot of memory, but if you change the data in one of the views, the change is reflected in all of the other views. I go through that in the book with images: when you copy an image and you change the original image, you don't see the change in the copy, but if you make a view where you invert the direction of the image and you change the view, it will affect both of them. The concept of views is very important to save memory and computation with NumPy.

Another topic that I go through is universal functions, and this is going to reappear substantially in future chapters, especially the ability to process a single element. Imagine you have a equal to an array, and b, each with three elements, and... what did I do wrong? Oh, I don't have numpy imported here. So a + b would write something along the lines of 5, 7, 9. Quite simple. But one could actually write a function that takes a single element from each, obviously trivial here, and does a plus b. We could make this function a universal function and apply it to both arrays, and the function would be called three times, once for each element.

There are several interesting things going on here, but one is that this is the way GPUs work, where essentially you have lots and lots of processes doing the same operation. And if this function is implemented in a language other than Python, the calls can be made in parallel. So this kind of mentality, based on universal functions, can very easily extrapolate to
parallel code when using Cython, and it can extrapolate to GPU computing. There will be a chapter at the end about doing GPU computing with Python, and it requires this mentality: if you have a big array, you don't write a for inside a for and do the processing element by element in Python. That is a complete waste, even without a GPU. Instead, you write the element-by-element processing as a function.

There is also another issue with universal functions that has to do with broadcasting, which I won't go through here because it's not really a processing issue, but I have to discuss it in the book.

The most important point here is this: you have a ginormous array, and instead of doing a for inside a for over the two dimensions, or however many, assuming it is a matrix, you write a function that processes a single case and then you ask NumPy: OK, call this function for me over the whole array, thousands or millions of times, whatever. And again, this is good for GPUs, and it is good for further optimization if you use a lower-level programming language.
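
The "function that processes a single case" idea can be sketched with `np.frompyfunc`, which wraps a plain Python function so that NumPy calls it once per element. The values of `a` and `b` are assumptions chosen only so that the sum comes out as 5, 7, 9, as in the talk; `add_one_pair` and the call counter are hypothetical names for the illustration. This shows the mentality, not the speed: the wrapped function still runs in Python, whereas the built-in `np.add` is the compiled version of the same element-wise operation.

```python
import numpy as np

# Counter kept in a dict so the function can update it without `global`.
calls = {"n": 0}


def add_one_pair(x, y):
    """Process a single pair of elements."""
    calls["n"] += 1
    return x + y


# Wrap the scalar function: 2 inputs, 1 output.
add_ufunc = np.frompyfunc(add_one_pair, 2, 1)

a = np.array([1, 2, 3])  # assumed values
b = np.array([4, 5, 6])  # assumed values

result = add_ufunc(a, b)   # object-dtype array, one Python call per element
builtin = np.add(a, b)     # same operation done by NumPy's compiled ufunc

print(result.tolist(), "computed with", calls["n"], "calls")
print(builtin.tolist())
```

Writing the per-element function first, and leaving the looping to NumPy, is what makes it possible to later swap in a compiled or GPU implementation without changing the calling code.
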
