graphite-api not returning data with multiple retentions #165

Open
gburson opened this issue May 5, 2016 · 5 comments

Comments

@gburson

gburson commented May 5, 2016

Hello,

I'm really scratching my head here. We've been running a grafana/graphite-api/carbon/whisper stack for a while now and it's working generally ok. However, I've noticed that if I drill into data in grafana, once I get to a certain level of detail, the chart is blank.

Here is some config. Our storage schema stores data at a 10-second interval for 7 days, then at a 1-minute interval for 2 years:

[Web_Prod]
priority = 90
pattern = ^Production..web..WebServer.*
retentions = 10s:7d,1m:2y

I can verify this in the whisper files themselves, like this: -

/usr/local/src/whisper/bin/whisper-dump.py /opt/graphite/storage/whisper/Production/Live/web/web2-vm/WebServer/Customer/HPS.wsp | less

Meta data:
aggregation method: average
max retention: 63072000
xFilesFactor: 0

Archive 0 info:
offset: 40
seconds per point: 10
points: 60480
retention: 604800
size: 725760

Archive 1 info:
offset: 725800
seconds per point: 60
points: 1051200
retention: 63072000
size: 12614400
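
The numbers in the dump above can be cross-checked against the configured retentions. The following is a minimal sketch, assuming the standard whisper on-disk layout (16-byte metadata header, 12 bytes per archive header, 12 bytes per data point); the function name `archive_layout` is hypothetical and not part of whisper.

```python
# Sizes assumed from the whisper file format.
METADATA_SIZE = 16      # aggregation type, max retention, xFilesFactor, archive count
ARCHIVE_INFO_SIZE = 12  # offset, seconds per point, points
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double


def archive_layout(retentions):
    """Compute offset, retention and size for each (seconds_per_point, points) pair."""
    offset = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(retentions)
    layout = []
    for seconds_per_point, points in retentions:
        size = points * POINT_SIZE
        layout.append({
            "offset": offset,
            "seconds_per_point": seconds_per_point,
            "points": points,
            "retention": seconds_per_point * points,
            "size": size,
        })
        offset += size  # the next archive starts right after this one
    return layout


# retentions = 10s:7d,1m:2y expressed as (seconds per point, number of points)
SEVEN_DAYS = 7 * 86400
TWO_YEARS = 2 * 365 * 86400
layout = archive_layout([(10, SEVEN_DAYS // 10), (60, TWO_YEARS // 60)])
for archive in layout:
    print(archive)
```

Under these assumptions the computed offsets, retentions and sizes agree with the whisper-dump output, which suggests the file itself was created with the intended schema.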

I've noticed the problem only happens when querying data older than 7 days, i.e. after it has been averaged to a 60-second interval. If I pick a three-minute window older than 7 days and look directly inside the whisper file, it all looks good:

/usr/local/src/whisper/bin/whisper-fetch.py --from 1454230700 --until 1454230880 /opt/graphite/storage/whisper/Production/Live/web/web2-vm/WebServer/Customer/HPS.wsp

1454230740 8.000000
1454230800 8.700000
1454230860 8.233333

However, if I query through graphite-api, it returns points at a 10-second interval (the wrong retention period, because the data I'm querying is older than 7 days), and every value, even at the timestamps shown above, is null.

http://www.dashboard.com/render?target=Production.Live.web.web2-vm.WebServer.Customer.HPS&from=1454230700&until=1454230880&format=json&maxDataPoints=1000

[{"target": "Production.Live.web.571854-web2-vm.WebServer.Customer.HPS", "datapoints": [[null, 1454230710], [null, 1454230720], [null, 1454230730], [null, 1454230740], [null, 1454230750], [null, 1454230760], [null, 1454230770], [null, 1454230780], [null, 1454230790], [null, 1454230800], [null, 1454230810], [null, 1454230820], [null, 1454230830], [null, 1454230840], [null, 1454230850], [null, 1454230860], [null, 1454230870], [null, 1454230880]]}]
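
A quick way to see which archive a render response was served from is to look at the spacing between consecutive timestamps. The following is a minimal sketch; `summarize_series` is a hypothetical helper, not part of graphite-api, and the sample series is rebuilt from the response above rather than fetched over HTTP.

```python
def summarize_series(series):
    """Report the timestamp spacing and null count of one render-API series."""
    datapoints = series["datapoints"]  # list of [value, timestamp] pairs
    timestamps = [ts for _, ts in datapoints]
    steps = sorted({b - a for a, b in zip(timestamps, timestamps[1:])})
    nulls = sum(1 for value, _ in datapoints if value is None)
    return {"steps": steps, "points": len(datapoints), "nulls": nulls}


# Same shape as the JSON above: all-null values every 10 seconds.
sample = {
    "target": "Production.Live.web.571854-web2-vm.WebServer.Customer.HPS",
    "datapoints": [[None, ts] for ts in range(1454230710, 1454230890, 10)],
}
summary = summarize_series(sample)
print(summary)
```

A step of 10 for data older than 7 days indicates the 10-second archive was read, even though only the 60-second archive still holds values for that period.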

If I go for a wider time span, I start to get data back, but some are null and some are populated. What am I doing wrong?!

Thanks,
Glen.

@lukyanov

I can confirm the issue. While whisper itself returns data as expected according to the configured retentions, graphite-api only works correctly within the interval of the first retention. Besides the "zooming" described above, it also seems to affect functions like timeShift().

@gburson
Author

gburson commented May 16, 2016

Yes likewise, I've seen other issues now, and I think this is a general bug with graphite-api. I'll see if there is a way of raising a bug with the project.

@gburson
Author

gburson commented May 23, 2016

My team has finally found the cause of this and fixed it in the source, so you can now zoom in on old data. It was a bug in one of the copies of the whisper code we have:

/usr/share/python/graphite/lib/python2.7/site-packages/graphite_api/_vendor/whisper.py

The call to read the data from the file had:

diff = untilTime - fromTime
for archive in header['archives']:
    if archive['retention'] >= diff:
        break

This should be:

diff = now - fromTime
for archive in header['archives']:
    if archive['retention'] >= diff:
        break
The other copies of whisper.py on the server are all OK. Interestingly, the incorrect one is a later version; the bug seems to have been introduced as a ‘fix’ in graphite-project/whisper@ccd0c89, but with no explanation of why the change was made.

If anyone could shed any light that would be cool!
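
To illustrate why the two expressions pick different archives for the query in this issue, here is a minimal sketch of the selection loop. The function `select_archive`, its `use_now` switch, the fallback to the last archive, and the value of `now` are assumptions for illustration; this is not the actual whisper code.

```python
def select_archive(archives, from_time, until_time, now, use_now=True):
    """Return the first archive whose retention covers the computed span."""
    # use_now=True mirrors `diff = now - fromTime`;
    # use_now=False mirrors `diff = untilTime - fromTime`.
    diff = (now - from_time) if use_now else (until_time - from_time)
    for archive in archives:
        if archive["retention"] >= diff:
            return archive
    return archives[-1]  # assumption: fall back to the coarsest archive


archives = [
    {"seconds_per_point": 10, "retention": 604800},    # 10s:7d
    {"seconds_per_point": 60, "retention": 63072000},  # 1m:2y
]
from_time, until_time = 1454230700, 1454230880
now = 1462406400  # assumed "now", roughly three months after from_time

by_age = select_archive(archives, from_time, until_time, now, use_now=True)
by_width = select_archive(archives, from_time, until_time, now, use_now=False)
print(by_age["seconds_per_point"], by_width["seconds_per_point"])
```

With `now - fromTime` the span is far longer than 7 days, so the 60-second archive is chosen. With `untilTime - fromTime` the span is only 180 seconds, so the 10-second archive is chosen even though it no longer holds data for that period, which matches the all-null 10-second response reported above.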

@brutasse
Owner

Here's a summary of the attempted "fix": graphite-project/whisper#139

I did port it, then reverted it, but there has been no release since the revert.

I have just pushed 1.1.3, which should fix the regression. Let me know how it works for you.

@lukyanov

@brutasse Could you update your docker image as well?
