Skip to content
This repository has been archived by the owner on Oct 23, 2019. It is now read-only.

New module, speccy.py #208

Open
wants to merge 43 commits into
base: gonzobot
Choose a base branch
from
Open

New module, speccy.py #208

wants to merge 43 commits into from

Conversation

Jessexd
Copy link

@Jessexd Jessexd commented Feb 17, 2018

Features:

Parses speccy links, e.g. http://speccy.piriform.com/results/IX6JcGoJaJhzinvBpyLBtbm and spits out relevant info.

Sample responses for various links:

<bot> (nick) OS: Windows 10 Home 64-bit ● RAM: 8.00GB ● CPU: Intel Core 2 Quad Q9400 @ 2.66GHz Yorkfield 45nm Technology ● GPU: NVIDIA GeForce 9800 GT (PNY) ● Badware: None ● Failing Drive(s): Disk: #1

<bot> (nick) OS: Windows 10 Education 64-bit ● RAM: 16.0GB Dual-Channel Unknown @ 1069MHz (15-15-15-36) ● CPU: Intel Core i7 6700K @ 4.00GHz: 26 °C Skylake 14nm Technology ● GPU: 4095MB NVIDIA GeForce GTX 970 (MSI): 58 °C ● Badware: Killer Service V2 ● Failing Drive(s): None

<bot> (nick) OS: Windows 10 Pro 64-bit ● RAM: 8.00GB Dual-Channel DDR3 @ 799MHz (11-11-11-28) ● CPU: AMD FX-8350: 20 °C Vishera 32nm Technology ● GPU: 4095MB NVIDIA GeForce GTX 1050 Ti (Gigabyte): 34 °C ● Badware: AutoPico Daily Restart, Service KMSELDI ● Failing Drive(s): None

Commands:

None as of yet, therefore no help information either. It purely automatically parses speccy links.


@asyncio.coroutine
@hook.event([EventType.message, EventType.action], singlethread=True)
def get_speccy_url(conn, message, chan, content, nick):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use hook.regex() rather than re-implementing the functionality.

with open(speccy.name, 'wb') as f:
f.write(respHtml)

soup = BS(open(speccy.name), "lxml-xml")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use a file here? soup = BS(response.content, 'lxml-xml') would work just fine

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to look into the other two requests, but about this one. This was actually an optimization improvement. If you know a better way lmk.

https://i.imgur.com/G2Si7PP.png

I noticed it also parses the link faster too:

https://i.imgur.com/R1eKAZT.png

test.py = the method you mentioned test2.py = the current method seen in snippet

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include your test case. Also the time command shows system resource usage not actual elapsed time. Disk I/O has a lot of waiting since disk I/O is slow. This code is also running in the main thread as you have marked it as a coroutine, so any blockling I/O like an HTTP request or file write will halt the entire bot until it finishes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't exactly know why it's faster, it was just a guess that ended up working, but here so you can try yourself:

https://gist.github.com/Jessexd/14990cbadad5e62a22fb590bb25a4f9d

Should I add multithreaded=True ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you should remove the @asyncio.coroutine decorator on your hook function, which is marking it as a coroutine, which runs in the main event loop.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, alright, as for BS, what should be done in terms of parsing links faster instead of writing to disk?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally just pass the content from the response straight in.

import requests

from bs4 import BeautifulSoup

with requests.get(url) as response:
    soup = BeautifulSoup(response.content, 'lxml', from_encoding=response.encoding)

print(soup)

It may not be the fastest option, but it's the standard way as something like disk i/o can be extremely unpredictable.

try:
boosterspec = soup.body.find(
"div", text=re.compile('.*booster', re.I)).text
except AttributeError:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be replaces with if soup.body.find(): whatever since .find() will return None when it can't find a matching element.

edwinksl pushed a commit to edwinksl/CloudBot that referenced this pull request Feb 17, 2018
Shutdown all running hooks when a plugin is unloaded
@Jessexd
Copy link
Author

Jessexd commented Feb 18, 2018

redid as per reviewers requested changes

if gpuspec:
data.append("\x02GPU:\x02" + " " + gpuspec)

picospec = body.find("div", text=re.compile('.*pico', re.I))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend compiling the regex once globally and reusing it rather than recompiling it on every call.

if picospec:
data.append("\x02Badware:\x02" + " " + picospec.text)

kmsspec = body.find("div", text=re.compile('.*kms', re.I))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend compiling the regex once globally and reusing it rather than recompiling it on every call.

if kmsspec:
data.append("\x02Badware:\x02" + " " + kmsspec.text)

boosterspec = body.find("div", text=re.compile('.*booster', re.I))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend compiling the regex once globally and reusing it rather than recompiling it on every call.

if boosterspec:
data.append("\x02Badware:\x02" + " " + boosterspec.text)

reviverspec = body.find("div", text=re.compile('.*reviver', re.I))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend compiling the regex once globally and reusing it rather than recompiling it on every call.

if reviverspec:
data.append("\x02Badware:\x02" + " " + reviverspec.text)

killerspec = body.find("div", text=re.compile('.*Killer.+Service', re.I))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend compiling the regex once globally and reusing it rather than recompiling it on every call.

def get_speccy_url(message, match, chan, nick):

with closing(requests.get(match.group(), stream=True,
timeout=3)) as response:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you don't read the body as a stream, specifying stream=True has no effect, The timeout is also quite low, no reason for that to be honest, as even if the request is slow, other things on the bot will continue to run fine. Waiting on a response from a server is very light on system resources,



@hook.regex(url_re)
def get_speccy_url(message, match, chan, nick):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nick, message, and chan are unused parameters, no need to include them.

from bs4 import BeautifulSoup as BS
from contextlib import closing

url_re = re.compile('https?:\/\/speccy.piriform.com\/results\/[A-z0-9]+', re.I)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll want to escape those ., otherwise they will match any character.
I also recommend against doing [A-z] as that includes all characters between the upper and lowercase letters, such as [, \, ], ^, _, and `.

except Exception:
smartspec = None

specout = re.sub("\s{2,}|\r\n|\n", " ", str(' + '.join(data)))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be better as re.sub(r"\s+", " ", ' + '.join(data)) as \s covers all whitespace, including CR and LF.


try:
z = smartcheck()
if len(z) != 0:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check if a list is empty, you should use the __bool__ value, don't check the length as that can be an expensive operation on a large list.
Something like: if z: instead of if len(z) != 0:

@Jessexd
Copy link
Author

Jessexd commented Feb 19, 2018

Hopefully nailed all requested changes and conformed to standards/quality.

@@ -0,0 +1,94 @@
# -*- coding: utf-8 -*-

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not include backup files in the PR.


values = []
for i in range(0, number_of_drives):
drives = drivespec[i].next_sibling.next_sibling.stripped_strings

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be replaced with

for i, spec in enumerate(drivespec):
    drives = spec.next_sibling.next_sibling.stripped_strings

Instead of repeatedly calling __getitem__ on the list.

linuxdaemon
linuxdaemon previously approved these changes Feb 20, 2018
MuffinMedic and others added 2 commits February 19, 2018 22:23
speccy.py: output format change, removed a cloudbot regex redundancy, optimization, removed redundant return None's
Jessexd and others added 6 commits February 28, 2018 14:43
…or anymore, new features: Memory usage C:\ Space left
Added checking ability, shouldn't return nothing in the chat upon error anymore, new features: Memory usage C:\ Space left
return c_free_space_data.find(class_="datavalue").text

if cspace():
data.append("\x02Space Left (C:\):\x02" + " " + cspace())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid repeatedly calling a function like this. It would be better to assign it to a variable before the check.

"div", class_="blue clear",
text='Physical Memory').next_sibling.next_sibling.find(
"div", text='Memory Usage:\xA0',
class_="datakey").parent.find(class_="datavalue").text

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if body.find("div", class_="blue clear", text='Physical Memory') returns None here, it will cause an error. You should do a None check like you do in the rest of the function.

data.append(
"\x02CPU:\x02" + " " + cpu_spec.next_sibling.next_sibling.text)

gpu_find = body.find("div", text='Graphics').next_sibling.next_sibling.text

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should do a None check before using the result of .find()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify/correct what you mean? In this one snippet it's including code for CPU and GPU.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments apply to the line directly above them, so in this case gpu_find

data.append("\x02Badware:\x02" + " " + killer_spec.text)

if 'Badware' not in specout():
data.append("\x02No Badware\x02")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be easier to just set a variable like has_badware = False and set that to True whenever you find something that's considered badware.

def cspace():
c_drive = body.find("div", class_="datavalue", text='C:')
if c_drive is not None:
c_drive = body.parent.parent

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is supposed to be c_drive = c_drive.parent.parent

@Jessexd
Copy link
Author

Jessexd commented Mar 4, 2018

Pushed requested changes


def smartcheck():
drive__spec = body.find_all("div", class_="blue clear", text="05")
drive_spec_SMART_checked = body.find("div", class_="blue clear", text="05")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this .find() duplicated?

killer_spec = body.find("div", text=KILLER_RE)
if killer_spec is not None:
has_badware = True
data.append("\x02Badware:\x02" + " " + killer_spec.text)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the badware logic could probably be collapsed to something like

badware_patterns = [KILLER_RE, REVIVER_RE, BOOSTER_RE, KMS_RE, PICO_RE]
badware = body.find_all("div", text=badware_patterns)
if badware:
    for tag in badware:
        data.append("\x02Badware:\x02" + " " + tag.text)
else:
    data.append("\x02No Badware\x02")

Rather than duplicating the code. Then any new Badware patterns just need to be added to that list, and that list could even be defined with the rest of the regexs like:

BADWARE_PATTERNS = [...]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, lots better

raw_value = SMART[rv_index + 1]
if raw_value != "0000000000":
values.append(str(i + 1))
return values

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why even use values here if you just return the first thing you find?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


def smartcheck():
drive_spec = body.find_all("div", class_="blue clear", text="05")
if drive_spec is not None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive_spec will never be None. .find_all() always returns a list, it may just be empty


drives = smartcheck()
if drives is not None:
for item in drives:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drives will always be a string or None here, as you changed how smartcheck() works. If you want to get info for multiple drives, you'll need to change the logic in smartcheck()

@Jessexd
Copy link
Author

Jessexd commented Mar 7, 2018

@linuxdaemon
Here's the change.

'.*reviver', re.I)
KILLER_RE = \
re.compile(
'.*killer.+service', re.I)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this line continuation needed? It makes the code much more unreadable

Copy link
Author

@Jessexd Jessexd Mar 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thought it looked better, as I see now It's kind of crowded. I hate those line warnings but I want so much to keep it pep8 acceptable.


@hook.regex(url_re)
def get_speccy_url(match):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extraneous whitespace

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't pinpoint what you see, as again, it's since been the same many revisions ago.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the extra line after the def line. I just didn't catch it in earlier reviews.

@hook.regex(url_re)
def get_speccy_url(match):

with closing(requests.get(match.group(), )) as response:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extraneous comma and whitespace

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you're referring to, these 4 lines have been the same since many reviewed (by you) revisions ago.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I'm referring to the comma and space after match.group()

cpu_spec = body.find("div", text='CPU')
if cpu_spec is not None:
data.append(
"\x02CPU:\x02" + " " + cpu_spec.next_sibling.next_sibling.text)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three if block could most likely be generalized to a function call since they are almost exactly the same.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll compress that too.

drive_spec_sn = \
drive_spec_sn_check.next_sibling.next_sibling.text
drive_spec_bad = True
if drive_spec_bad is True:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you check if drive_spec_bad is True directly after setting it to True?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't realize that, tested fine, but I'll remove that certainly.

drive_spec_sn_find = drive_spec_smart.parent.parent \
.previous_sibling.previous_sibling.previous_sibling \
.previous_sibling.previous_element.previous_element \
.previous_sibling.previous_sibling

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely you can use a different criteria to find this element. Like matching an specific attribute or something.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see how I can compress that where it's less obviously ridiculous.

for drive_spec_smart_found in drive_spec_smart_find.find_all(
"div", class_="datakey", text="Raw Value:\xA0"):
drives = drive_spec_smart_found.next_sibling.next_sibling
if "0000000000" in drives != 0:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not even sure what this is trying to accomplish.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checks if 0000000000 in the grabbed 05 div of HTML. 0000000000 being the raw SMART value. If it's anything that is anything but 0000000000, it's a broken drive. Below would be obviously what to do if it's not 0000000000.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the comparison to 0 though? surely if "0000000000" in drives" is what you want?

"div", class_="datakey", text="Raw Value:\xA0"):
drives = drive_spec_smart_found.next_sibling.next_sibling
if "0000000000" in drives != 0:
str.strip('0')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement doesn't actually do anything.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, thought it did. One less piece of code needed once I test it. 👍

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str is the raw string type. str.strip('0') is equivalent to '0'.strip() and since you don't set the resturn value of that statement to anything, this statement doesn't do anything. What you should do is examine why you originally added this line and see if it was a mistake in case you do still need to strip a string.

@linuxdaemon
Copy link

Do you have any intention of continuing work on this plugin?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants