pip install pyhs2 is not installing the latest master branch? #61

Open
mkmoisen opened this issue Apr 28, 2016 · 8 comments

Comments

@mkmoisen

If I call cur.fetchmany(i), it fails with:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/apps/Python/lib/python2.7/site-packages/pyhs2/cursor.py", line 152, in fetchmany
    if size < 0 or size > MAX_BLOCK_SIZE:
NameError: global name 'MAX_BLOCK_SIZE' is not defined

I can see in this repo that the code has been changed to if size < 0 or size > self.MAX_BLOCK_SIZE:

However, the version installed by pip lacks the self. prefix, which is why it throws this error.
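
For context, this is the error Python raises when a method refers to a class-level constant by its bare name: names defined in a class body are not in scope inside its methods, so they have to be reached through self or the class. A minimal sketch of both variants follows; the two cursor classes and the value of MAX_BLOCK_SIZE are made up for illustration and are not pyhs2's actual code.

```python
class BrokenCursor(object):
    MAX_BLOCK_SIZE = 10000  # illustrative value, not taken from pyhs2

    def fetchmany(self, size):
        # Bare name: looked up in local, global and builtin scope only,
        # never in the class body, so evaluating it raises NameError.
        if size < 0 or size > MAX_BLOCK_SIZE:
            raise ValueError("invalid size")
        return []


class FixedCursor(object):
    MAX_BLOCK_SIZE = 10000  # illustrative value, not taken from pyhs2

    def fetchmany(self, size):
        # Attribute lookup through self finds the class attribute.
        if size < 0 or size > self.MAX_BLOCK_SIZE:
            raise ValueError("invalid size")
        return []


def raises_name_error(cursor):
    """Return True if cursor.fetchmany(5) fails with NameError."""
    try:
        cursor.fetchmany(5)
    except NameError:
        return True
    return False
```

Calling raises_name_error() on an instance of each class shows that only the bare-name variant fails, which matches the difference between the pip-installed cursor.py and the one in the repo.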

Is there a mismatch between this repo and pip?

pip search pyhs2
pyhs2                     - Python Hive Server 2 Client Driver
  INSTALLED: 0.6.0 (latest)
@anshanno

Have you had any luck with this? I've been experiencing the same issue and haven't been able to find anything on the subject aside from this thread.

@mkmoisen
Author

Are you referring to the cur.fetchmany() or the pip issue?

@anshanno

@mkmoisen well, they are sort of intertwined since pip is installing a slightly different version than the repo. I am getting the same error when I try to use fetchmany().

@mkmoisen
Author

mkmoisen commented May 26, 2016

@anshanno Got it. I ended up uninstalling it from pip, cloning the latest git repo, and then installing it using the setup.py file.

Best regards,

Matthew Moisen
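
After a reinstall like this, one way to confirm which variant of cursor.py Python actually imports is to search the installed module's source for the fixed expression. The helper below is only a sketch: source_contains is a hypothetical name, and the commented usage assumes the module is importable as pyhs2.cursor, as the path in the traceback suggests.

```python
import inspect


def source_contains(module, needle):
    """Return True if the source code of `module` contains the text `needle`."""
    try:
        source = inspect.getsource(module)
    except (OSError, IOError, TypeError):
        # Source is unavailable (for example a compiled-only install
        # or a built-in module).
        return False
    return needle in source


# Hypothetical usage against the installed package:
#   import pyhs2.cursor
#   source_contains(pyhs2.cursor, "self.MAX_BLOCK_SIZE")
```

If the check returns False for the installed package, the copy on the import path is most likely still the one without the self. prefix.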

@anshanno

@mkmoisen Alright, thanks. What is the best practice for using .fetchmany()?

@mkmoisen
Author

mkmoisen commented May 26, 2016

@anshanno I actually gave up on fetchmany and use fetchall instead. fetchmany appears to have an annoying bug that I've described in my pull request that you can take a look at.

This library also uses str instead of unicode, which caused me some errors with Flask/Jinja templates. I've raised another pull request for that.

My application is a simple read-only app. My general flow is the following. I use a thread to obtain a connection to the database, because I noticed that when HS2 is down, the connection attempt hangs for a long time. If the thread exceeds a certain time limit, I raise an exception to fail gracefully.

import threading

import pyhs2

def _thread_get_connection(database, ret, e):
    '''
    If HS2 is down, a connection attempt hangs a long time before raising an exception.
    Run the attempt in a thread so that the caller can stop waiting after 5 seconds.
    :param database: name of the database to connect to
    :param ret: a dict to hold the conn so that it can be returned
    :param e: a threading.Event(), set by the caller once it has given up waiting
    '''
    conn = pyhs2.connect(host=HIVE2_HOST,
                         port=HIVE2_PORT,
                         authMechanism="PLAIN",
                         user=HIVE2_USER,
                         password=HIVE2_PASSWORD,
                         database=database)

    # If the caller already timed out but the connection went through after all,
    # close it and do not hand it back, since nobody will use it.
    # (close_connection is an application helper not shown here.)
    if e.isSet():
        close_connection(conn)
        return

    ret['conn'] = conn

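def get_connection(database):
    '''
    Helper function to get a connection in a thread.
    Returns None if no connection was obtained within 5 seconds.
    '''
    ret = {}
    e = threading.Event()
    t = threading.Thread(target=_thread_get_connection, args=(database, ret, e))
    t.start()
    t.join(5)
    if t.is_alive():
        # Tell the worker that nobody is waiting for the connection any more
        e.set()
        # No exception is being handled here, so log with error() rather than exception()
        app.logger.error("Cannot connect to HS2, it must be down")
        return None

    # The worker may have died with an exception before storing a connection
    return ret.get('conn')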

# ... The following is wrapped in a try/except for pyhs2's Pyhs2Exception
conn = get_connection(database)
if conn is None:
    raise ServerException("HS2 is down, cannot connect!")

with conn.cursor() as cur:
    cur.execute(select)
    # Get field names out of schema. Remove "table_name." if a select * was executed
    schema = cur.getSchema()
    columns = [field['columnName'][field['columnName'].index('.') + 1:]
               if '.' in field['columnName'] else field['columnName']
               for field in schema]
    rows = cur.fetchall()
    # Convert str values to unicode and None to an empty unicode string (Python 2)
    rows = [[value.decode('utf-8') if isinstance(value, str)
             else u'' if value is None else value
             for value in row]
            for row in rows]
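
The column-name and unicode handling at the end of this snippet can also be pulled out into small helpers, which makes it easier to try without a live HS2 connection. This is only a sketch of the same logic: the function names are hypothetical, and it checks for bytes rather than str so that it behaves the same on Python 2 (where bytes is str) and Python 3.

```python
def strip_table_prefix(column_name):
    """Drop a leading "table_name." from a column name, if present."""
    if '.' in column_name:
        return column_name[column_name.index('.') + 1:]
    return column_name


def to_text(value):
    """Decode byte strings as UTF-8, map None to u'', pass other values through."""
    if value is None:
        return u''
    if isinstance(value, bytes):  # on Python 2, bytes is an alias for str
        return value.decode('utf-8')
    return value


def normalize_rows(rows):
    """Apply to_text() to every value of every row."""
    return [[to_text(value) for value in row] for row in rows]


# Made-up sample data in the shape used above: a schema is a list of dicts
# with a 'columnName' key, and rows are lists of values.
schema = [{'columnName': 'users.id'}, {'columnName': 'name'}]
columns = [strip_table_prefix(field['columnName']) for field in schema]
rows = normalize_rows([[1, b'caf\xc3\xa9'], [2, None]])
```

Keeping the conversion in one place also means it can be removed in a single spot if the library starts returning unicode itself.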

@kpweiler

This is still an issue with whatever source is being used on PyPI. The git master branch works fine, but not the source on PyPI.

@mkmoisen
Author

@kpweiler Do you mean to say that this is a known issue with all repos, not just this one?
