Is there a numpy builtin to do something like the following? That is, take a list d and return a list filtered_d with any outlying elements removed based on some assumed distribution of the points in d.
import numpy as np

def reject_outliers(data, m=2):
    # Keep only the elements within m standard deviations of the mean.
    u = np.mean(data)
    s = np.std(data)
    filtered = [e for e in data if (u - m * s < e < u + m * s)]
    return filtered
>>> d = [2,4,5,1,6,5,40]
>>> filtered_d = reject_outliers(d)
>>> print(filtered_d)
[2, 4, 5, 1, 6, 5]
I say "something like" because the function might allow for varying distributions (Poisson, Gaussian, etc.) and varying outlier thresholds within those distributions (like the m I've used here).
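A vectorized variant of the function above can be sketched with a boolean mask. This is only one possible approach, not a NumPy builtin: it assumes roughly Gaussian data, and the default threshold m=2.0 is a value chosen for illustration.

```python
import numpy as np

def reject_outliers(data, m=2.0):
    """Return the elements lying within m standard deviations of the mean."""
    arr = np.asarray(data, dtype=float)
    # Boolean mask: True where the element is close enough to the mean.
    mask = np.abs(arr - arr.mean()) < m * arr.std()
    return arr[mask]

filtered = reject_outliers([2, 4, 5, 1, 6, 5, 40])
print(filtered.tolist())  # [2.0, 4.0, 5.0, 1.0, 6.0, 5.0]
```

Because the mean and standard deviation are themselves pulled by extreme values, a median-based criterion may be more robust for heavily skewed data.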
Is there a numpy builtin to reject outliers from a list filter: Questions
List comprehension vs. lambda + filter
5 answers
I happened to find myself having a basic filtering need: I have a list and I have to filter it by an attribute of the items.
My code looked like this:
my_list = [x for x in my_list if x.attribute == value]
But then I thought, wouldn't it be better to write it like this?
my_list = filter(lambda x: x.attribute == value, my_list)
It's more readable, and if needed for performance the lambda could be taken out to gain something.
The question is: are there any caveats in using the second way? Any performance difference? Am I missing the Pythonic Way™ entirely and should do it in yet another way (such as using itemgetter instead of the lambda)?
Answer #1
It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter+lambda, but use whichever you find easier.
There are two things that may slow down your use of filter.
The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn't think much about performance until you've timed your code and found it to be a bottleneck, but the difference will be there.
The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable, and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function, so it will also be accessing value through a closure and this difference won't apply.
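A rough way to check both overheads on your own machine is to time the two forms with timeit. The Item type, the list size, and the repeat count below are assumptions made for illustration, and the numbers will vary by interpreter version and hardware.

```python
import timeit
from collections import namedtuple

# Hypothetical item type with a single attribute, for illustration only.
Item = namedtuple("Item", "attribute")
items = [Item(i % 10) for i in range(10_000)]
value = 3

def with_comprehension():
    return [x for x in items if x.attribute == value]

def with_filter():
    # filter() returns a lazy iterator in Python 3, so materialize it.
    return list(filter(lambda x: x.attribute == value, items))

t_comp = timeit.timeit(with_comprehension, number=50)
t_filter = timeit.timeit(with_filter, number=50)
print("comprehension: {:.4f}s  filter+lambda: {:.4f}s".format(t_comp, t_filter))
```

Both functions return the same list, so whichever reads better can be used unless the timing shows a difference that matters.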
The other option to consider is to use a generator instead of a list comprehension:
def filterbyvalue(seq, value):
    for el in seq:
        if el.attribute == value:
            yield el
Then in your main code (which is where readability really matters) you've replaced both list comprehension and filter with a hopefully meaningful function name.
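A possible call site for such a generator might look like the sketch below. The Item type and the sample data are hypothetical; only filterbyvalue comes from the answer.

```python
from collections import namedtuple

# Hypothetical record type, used only to give the items an attribute.
Item = namedtuple("Item", ["name", "attribute"])

def filterbyvalue(seq, value):
    """Lazily yield the elements whose attribute equals value."""
    for el in seq:
        if el.attribute == value:
            yield el

items = [Item("a", 1), Item("b", 2), Item("c", 1)]

# The generator is lazy; wrap it in list() when a list is needed.
matches = list(filterbyvalue(items, 1))
print([m.name for m in matches])  # ['a', 'c']
```

Since nothing is computed until the generator is iterated, this form also works on large or streaming sequences without building an intermediate list.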
Answer #2
This is a somewhat religious issue in Python. Even though Guido considered removing map, filter and reduce from Python 3, there was enough of a backlash that in the end only reduce was moved from built-ins to functools.reduce.
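In practice this means that, in Python 3, reduce needs an import while map and filter do not. A small sketch with made-up numbers:

```python
from functools import reduce  # no longer a builtin in Python 3
import operator

numbers = [1, 2, 3, 4]

total = reduce(operator.add, numbers, 0)
evens = list(filter(lambda n: n % 2 == 0, numbers))   # still a builtin
doubled = list(map(lambda n: n * 2, numbers))         # still a builtin

print(total, evens, doubled)  # 10 [2, 4] [2, 4, 6, 8]
```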
Personally I find list comprehensions easier to read. It is more explicit what is happening from the expression [i for i in list if i.attribute == value], as all the behaviour is on the surface, not inside the filter function.
I would not worry too much about the performance difference between the two approaches as it is marginal. I would really only optimise this if it proved to be the bottleneck in your application which is unlikely.
Also, since the BDFL wanted filter gone from the language, then surely that automatically makes list comprehensions more Pythonic ;-)
How do I do a not equal in Django queryset filtering?
5 answers
In Django model QuerySets, I see that there is a __gt and __lt for comparative values, but is there a __ne or != (not equals)? I want to filter out using a not equals. For example, for
Model:
bool a;
int x;
I want to do
results = Model.objects.exclude(a=True, x!=5)
The != is not correct syntax. I also tried __ne.
I ended up using:
results = Model.objects.exclude(a=True, x__lt=5).exclude(a=True, x__gt=5)
Answer #1
You can use Q objects for this. They can be negated with the ~ operator and combined much like normal Python expressions:
from myapp.models import Entry
from django.db.models import Q
Entry.objects.filter(~Q(id=3))
will return all entries except the one(s) with 3 as their ID:
[<Entry: Entry object>, <Entry: Entry object>, <Entry: Entry object>, ...]
Meaning of @classmethod and @staticmethod for beginner?
5 answers
Could someone explain to me the meaning of @classmethod and @staticmethod in Python? I need to know the difference and the meaning.
As far as I understand, @classmethod tells a class that it's a method which should be inherited into subclasses, or... something. However, what's the point of that? Why not just define the class method without adding @classmethod or @staticmethod or any @ definitions?
tl;dr: when should I use them, why should I use them, and how should I use them?
Answer #1
Though classmethod and staticmethod are quite similar, there's a slight difference in usage for both entities: classmethod must have a reference to a class object as the first parameter, whereas staticmethod can have no parameters at all.
Example
class Date(object):

    def __init__(self, day=0, month=0, year=0):
        self.day = day
        self.month = month
        self.year = year

    @classmethod
    def from_string(cls, date_as_string):
        day, month, year = map(int, date_as_string.split("-"))
        date1 = cls(day, month, year)
        return date1

    @staticmethod
    def is_date_valid(date_as_string):
        day, month, year = map(int, date_as_string.split("-"))
        return day <= 31 and month <= 12 and year <= 3999

date2 = Date.from_string("11-09-2012")
is_date = Date.is_date_valid("11-09-2012")
Explanation
Let's assume an example of a class, dealing with date information (this will be our boilerplate):
class Date(object):

    def __init__(self, day=0, month=0, year=0):
        self.day = day
        self.month = month
        self.year = year
This class obviously could be used to store information about certain dates (without timezone information; let's assume all dates are presented in UTC).
Here we have __init__, a typical initializer of Python class instances, which receives arguments as a typical instance method, having the first non-optional argument (self) that holds a reference to a newly created instance.
Class Method
We have some tasks that can be nicely done using classmethods.
Let's assume that we want to create a lot of Date class instances having date information coming from an outer source encoded as a string with format "dd-mm-yyyy". Suppose we have to do this in different places in the source code of our project.
So what we must do here is:
- Parse a string to receive day, month and year as three integer variables or a 3-item tuple consisting of those values.
- Instantiate Date by passing those values to the initialization call.
This will look like:
day, month, year = map(int, string_date.split("-"))
date1 = Date(day, month, year)
For this purpose, C++ can implement such a feature with overloading, but Python lacks this overloading. Instead, we can use classmethod. Let's create another "constructor".
    @classmethod
    def from_string(cls, date_as_string):
        day, month, year = map(int, date_as_string.split("-"))
        date1 = cls(day, month, year)
        return date1

date2 = Date.from_string("11-09-2012")
Let's look more carefully at the above implementation, and review what advantages we have here:
- We've implemented date string parsing in one place and it's reusable now.
- Encapsulation works fine here (if you think that you could implement string parsing as a single function elsewhere, this solution fits the OOP paradigm far better).
- cls is an object that holds the class itself, not an instance of the class. It's pretty cool because if we inherit our Date class, all children will have from_string defined also.
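The inheritance point can be sketched as follows. SlashDate and its describe method are hypothetical names introduced only for this illustration; Date and from_string follow the answer's example.

```python
class Date:
    def __init__(self, day=0, month=0, year=0):
        self.day = day
        self.month = month
        self.year = year

    @classmethod
    def from_string(cls, date_as_string):
        day, month, year = map(int, date_as_string.split("-"))
        # cls is whichever class the method was called on.
        return cls(day, month, year)


class SlashDate(Date):  # hypothetical subclass, for illustration only
    def describe(self):
        return "{0:02d}/{1:02d}/{2}".format(self.day, self.month, self.year)


d = SlashDate.from_string("11-09-2012")
print(type(d).__name__)  # SlashDate
print(d.describe())      # 11/09/2012
```

The subclass did not have to redefine from_string: the inherited method received SlashDate as cls and built an instance of it.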
Static method
What about staticmethod? It's pretty similar to classmethod but doesn't take any obligatory parameters (like a class method or instance method does).
Let"s look at the next use case.
We have a date string that we want to validate somehow. This task is also logically bound to the Date class we've used so far, but doesn't require instantiation of it.
Here is where staticmethod can be useful. Let's look at the next piece of code:
    @staticmethod
    def is_date_valid(date_as_string):
        day, month, year = map(int, date_as_string.split("-"))
        return day <= 31 and month <= 12 and year <= 3999

# usage:
is_date = Date.is_date_valid("11-09-2012")
So, as we can see from the usage of staticmethod, we don't have any access to what the class is: it's basically just a function, called syntactically like a method, but without access to the object and its internals (fields and other methods), whereas a classmethod does have access to the class.
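The check in the example only bounds the values from above, so strings such as "31-02-2012" or "0-0-0" would pass. If real calendar validity is wanted, one possible stricter sketch delegates to the standard library's datetime.strptime. This is a swapped-in technique, not the answer's own method, and it assumes the same "dd-mm-yyyy" format.

```python
from datetime import datetime


class Date:
    def __init__(self, day=0, month=0, year=0):
        self.day = day
        self.month = month
        self.year = year

    @staticmethod
    def is_date_valid(date_as_string):
        # strptime raises ValueError for malformed strings and for
        # impossible calendar dates such as 31 February.
        try:
            datetime.strptime(date_as_string, "%d-%m-%Y")
        except ValueError:
            return False
        return True


print(Date.is_date_valid("11-09-2012"))  # True
print(Date.is_date_valid("31-02-2012"))  # False
```

The method still needs neither cls nor self, which is what makes staticmethod a reasonable fit here.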
Answer #2
Rostyslav Dzinko's answer is very appropriate. I thought I could highlight one other reason you should choose @classmethod over @staticmethod when you are creating an additional constructor.
In the example above, Rostyslav used the @classmethod from_string as a Factory to create Date objects from otherwise unacceptable parameters. The same can be done with @staticmethod, as is shown in the code below:
class Date:
    def __init__(self, month, day, year):
        self.month = month
        self.day = day
        self.year = year

    def display(self):
        return "{0}-{1}-{2}".format(self.month, self.day, self.year)

    @staticmethod
    def millenium(month, day):
        return Date(month, day, 2000)

new_year = Date(1, 1, 2013)  # Creates a new Date object
millenium_new_year = Date.millenium(1, 1)  # also creates a Date object.

# Proof:
new_year.display()  # "1-1-2013"
millenium_new_year.display()  # "1-1-2000"
isinstance(new_year, Date)  # True
isinstance(millenium_new_year, Date)  # True
Thus both new_year and millenium_new_year are instances of the Date class.
But, if you observe closely, the Factory process is hard-coded to create Date objects no matter what. What this means is that even if the Date class is subclassed, the subclasses will still create plain Date objects (without any properties of the subclass). See that in the example below:
class DateTime(Date):
    def display(self):
        return "{0}-{1}-{2} - 00:00:00PM".format(self.month, self.day, self.year)

datetime1 = DateTime(10, 10, 1990)
datetime2 = DateTime.millenium(10, 10)
isinstance(datetime1, DateTime)  # True
isinstance(datetime2, DateTime)  # False
datetime1.display()  # returns "10-10-1990 - 00:00:00PM"
# returns "10-10-2000" because it's not a DateTime object but a Date object.
# Check the implementation of the millenium method on the Date class for more details.
datetime2.display()
datetime2 is not an instance of DateTime? WTF? Well, that's because of the @staticmethod decorator used.
In most cases, this is undesired. If what you want is a Factory method that is aware of the class that called it, then @classmethod is what you need.
Rewriting Date.millenium as (that's the only part of the above code that changes):
    @classmethod
    def millenium(cls, month, day):
        return cls(month, day, 2000)
ensures that the class is not hard-coded but rather learnt. cls can be any subclass. The resulting object will rightly be an instance of cls.
Let's test that out:
datetime1 = DateTime(10, 10, 1990)
datetime2 = DateTime.millenium(10, 10)
isinstance(datetime1, DateTime) # True
isinstance(datetime2, DateTime) # True
datetime1.display() # "10-10-1990 - 00:00:00PM"
datetime2.display() # "10-10-2000 - 00:00:00PM"
The reason is, as you know by now, that @classmethod was used instead of @staticmethod.
Answer #3
@classmethod means: when this method is called, we pass the class as the first argument instead of the instance of that class (as we normally do with methods). This means you can use the class and its properties inside that method rather than a particular instance.
@staticmethod means: when this method is called, we don't pass an instance of the class to it (as we normally do with methods). This means you can put a function inside a class but you can't access the instance of that class (this is useful when your method does not use the instance).
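What each kind of method receives can be made visible with a small sketch. Demo and its method names are hypothetical and exist only to show the first argument in each case.

```python
class Demo:
    def instance_method(self):
        # Receives the instance the method was called on.
        return self

    @classmethod
    def class_method(cls):
        # Receives the class, even when called through an instance.
        return cls

    @staticmethod
    def static_method():
        # Receives nothing implicitly.
        return "no implicit argument"


obj = Demo()
print(obj.instance_method() is obj)  # True
print(obj.class_method() is Demo)    # True
print(Demo.class_method() is Demo)   # True
print(Demo.static_method())          # no implicit argument
```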