bs4.FeatureNotFound: Couldn”t find a tree builder with the features you requested: lxml. Do you need to install a parser library?

| | | | |
...
soup = BeautifulSoup(html, "lxml")
File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__
% ";".join(features))
bs4.FeatureNotFound: Couldn"t find a tree builder with the features you requested: lxml. Do you need to install a parser library?

The above outputs on my Terminal. I am on Mac OS 10.7.x. I have Python 2.7.1, and followed this tutorial to get Beautiful Soup and lxml, which both installed successfully and work with a separate test file located here. In the Python script that causes this error, I have included this line: from pageCrawler import comparePages And in the pageCrawler file I have included the following two lines: from bs4 import BeautifulSoup from urllib2 import urlopen

Any help in figuring out what the problem is and how it can be solved would much be appreciated.

bs4.FeatureNotFound: Couldn"t find a tree builder with the features you requested: lxml. Do you need to install a parser library? find: Questions

Finding the index of an item in a list

5 answers

Given a list ["foo", "bar", "baz"] and an item in the list "bar", how do I get its index (1) in Python?

3740

Answer #1

>>> ["foo", "bar", "baz"].index("bar")
1

Reference: Data Structures > More on Lists

Caveats follow

Note that while this is perhaps the cleanest way to answer the question as asked, index is a rather weak component of the list API, and I can"t remember the last time I used it in anger. It"s been pointed out to me in the comments that because this answer is heavily referenced, it should be made more complete. Some caveats about list.index follow. It is probably worth initially taking a look at the documentation for it:

list.index(x[, start[, end]])

Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.

The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.

Linear time-complexity in list length

An index call checks every element of the list in order, until it finds a match. If your list is long, and you don"t know roughly where in the list it occurs, this search could become a bottleneck. In that case, you should consider a different data structure. Note that if you know roughly where to find the match, you can give index a hint. For instance, in this snippet, l.index(999_999, 999_990, 1_000_000) is roughly five orders of magnitude faster than straight l.index(999_999), because the former only has to search 10 entries, while the latter searches a million:

>>> import timeit
>>> timeit.timeit("l.index(999_999)", setup="l = list(range(0, 1_000_000))", number=1000)
9.356267921015387
>>> timeit.timeit("l.index(999_999, 999_990, 1_000_000)", setup="l = list(range(0, 1_000_000))", number=1000)
0.0004404920036904514
 

Only returns the index of the first match to its argument

A call to index searches through the list in order until it finds a match, and stops there. If you expect to need indices of more matches, you should use a list comprehension, or generator expression.

>>> [1, 1].index(1)
0
>>> [i for i, e in enumerate([1, 2, 1]) if e == 1]
[0, 2]
>>> g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
>>> next(g)
0
>>> next(g)
2

Most places where I once would have used index, I now use a list comprehension or generator expression because they"re more generalizable. So if you"re considering reaching for index, take a look at these excellent Python features.

Throws if element not present in list

A call to index results in a ValueError if the item"s not present.

>>> [1, 1].index(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 2 is not in list

If the item might not be present in the list, you should either

  1. Check for it first with item in my_list (clean, readable approach), or
  2. Wrap the index call in a try/except block which catches ValueError (probably faster, at least when the list to search is long, and the item is usually present.)

3740

Answer #2

One thing that is really helpful in learning Python is to use the interactive help function:

>>> help(["foo", "bar", "baz"])
Help on list object:

class list(object)
 ...

 |
 |  index(...)
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value
 |

which will often lead you to the method you are looking for.

3740

Answer #3

The majority of answers explain how to find a single index, but their methods do not return multiple indexes if the item is in the list multiple times. Use enumerate():

for i, j in enumerate(["foo", "bar", "baz"]):
    if j == "bar":
        print(i)

The index() function only returns the first occurrence, while enumerate() returns all occurrences.

As a list comprehension:

[i for i, j in enumerate(["foo", "bar", "baz"]) if j == "bar"]

Here"s also another small solution with itertools.count() (which is pretty much the same approach as enumerate):

from itertools import izip as zip, count # izip for maximum efficiency
[i for i, j in zip(count(), ["foo", "bar", "baz"]) if j == "bar"]

This is more efficient for larger lists than using enumerate():

$ python -m timeit -s "from itertools import izip as zip, count" "[i for i, j in zip(count(), ["foo", "bar", "baz"]*500) if j == "bar"]"
10000 loops, best of 3: 174 usec per loop
$ python -m timeit "[i for i, j in enumerate(["foo", "bar", "baz"]*500) if j == "bar"]"
10000 loops, best of 3: 196 usec per loop

bs4.FeatureNotFound: Couldn"t find a tree builder with the features you requested: lxml. Do you need to install a parser library? iat: Questions

iat

InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately

3 answers

Tried to perform REST GET through python requests with the following code and I got error.

Code snip:

import requests
header = {"Authorization": "Bearer..."}
url = az_base_url + az_subscription_id + "/resourcegroups/Default-Networking/resources?" + az_api_version
r = requests.get(url, headers=header)

Error:

/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:79: 
          InsecurePlatformWarning: A true SSLContext object is not available. 
          This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. 
          For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning

My python version is 2.7.3. I tried to install urllib3 and requests[security] as some other thread suggests, I still got the same error.

Wonder if anyone can provide some tips?

334

Answer #1

The docs give a fair indicator of what"s required., however requests allow us to skip a few steps:

You only need to install the security package extras (thanks @admdrew for pointing it out)

$ pip install requests[security]

or, install them directly:

$ pip install pyopenssl ndg-httpsclient pyasn1

Requests will then automatically inject pyopenssl into urllib3


If you"re on ubuntu, you may run into trouble installing pyopenssl, you"ll need these dependencies:

$ apt-get install libffi-dev libssl-dev

iat

Dynamic instantiation from string name of a class in dynamically imported module?

3 answers

In python, I have to instantiate certain class, knowing its name in a string, but this class "lives" in a dynamically imported module. An example follows:

loader-class script:

import sys
class loader:
  def __init__(self, module_name, class_name): # both args are strings
    try:
      __import__(module_name)
      modul = sys.modules[module_name]
      instance = modul.class_name() # obviously this doesn"t works, here is my main problem!
    except ImportError:
       # manage import error

some-dynamically-loaded-module script:

class myName:
  # etc...

I use this arrangement to make any dynamically-loaded-module to be used by the loader-class following certain predefined behaviours in the dyn-loaded-modules...

222

Answer #1

You can use getattr

getattr(module, class_name)

to access the class. More complete code:

module = __import__(module_name)
class_ = getattr(module, class_name)
instance = class_()

As mentioned below, we may use importlib

import importlib
module = importlib.import_module(module_name)
class_ = getattr(module, class_name)
instance = class_()

iat

How to get all of the immediate subdirectories in Python

3 answers

I"m trying to write a simple Python script that will copy a index.tpl to index.html in all of the subdirectories (with a few exceptions).

I"m getting bogged down by trying to get the list of subdirectories.

184

Answer #1

import os
def get_immediate_subdirectories(a_dir):
    return [name for name in os.listdir(a_dir)
            if os.path.isdir(os.path.join(a_dir, name))]

Shop

Best laptop for Fortnite

$

Best laptop for Excel

$

Best laptop for Solidworks

$

Best laptop for Roblox

$

Best computer for crypto mining

$

Best laptop for Sims 4

$

Best laptop for Zoom

$499

Best laptop for Minecraft

$590

Latest questions

NUMPYNUMPY

psycopg2: insert multiple rows with one query

12 answers

NUMPYNUMPY

How to convert Nonetype to int or string?

12 answers

NUMPYNUMPY

How to specify multiple return types using type-hints

12 answers

NUMPYNUMPY

Javascript Error: IPython is not defined in JupyterLab

12 answers

News

Wiki

Python OpenCV | cv2.putText () method

numpy.arctan2 () in Python

Python | os.path.realpath () method

Python OpenCV | cv2.circle () method

Python OpenCV cv2.cvtColor () method

Python - Move item to the end of the list

time.perf_counter () function in Python

Check if one list is a subset of another in Python

Python os.path.join () method