Closed
Bug 748417
Opened 12 years ago
Closed 12 years ago
write python script to convert TelemetryHistograms.h to JSON
Categories
(Toolkit :: Telemetry, defect)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla18
People
(Reporter: taras.mozilla, Assigned: froydnj)
References
Details
Attachments
(5 files, 15 obsolete files)
13.11 KB,
patch
|
taras.mozilla
:
review+
|
Details | Diff | Splinter Review |
11.41 KB,
patch
|
taras.mozilla
:
review+
|
Details | Diff | Splinter Review |
166.94 KB,
text/plain
|
Details | |
3.03 KB,
patch
|
taras.mozilla
:
review+
|
Details | Diff | Splinter Review |
15.15 KB,
patch
|
froydnj
:
review+
akeybl
:
approval-mozilla-aurora-
|
Details | Diff | Splinter Review |
We need this to validate histograms on serverside
Reporter | ||
Updated•12 years ago
|
Assignee: nobody → taras.mozilla
Assignee | ||
Comment 1•12 years ago
|
||
If this is going to happen, it would be great if the script could also build gHistograms during the build, so that we didn't have char*s in gHistograms, thereby making things smaller and avoiding relocations. Maybe a separate bug?
Comment 2•12 years ago
|
||
The valid histograms change over time; the plan is to keep this JSON object around for each buildid?
Comment 3•12 years ago
|
||
> Maybe a separate bug?
Think so, yes. :)
Reporter | ||
Comment 4•12 years ago
|
||
Reporter | ||
Comment 5•12 years ago
|
||
Sample json output. This reports units, axis labels. I could port c++ logic to report exact bucket #s and replace those with units when available. However before I do that work, xstevens, is that required? This script also highlighted some telemetry abuse, perhaps we should run it as part of the build as a way to generate nathan's table(This script would become a module for bug 748444)
Reporter | ||
Comment 6•12 years ago
|
||
If anyone has a good idea for a strategy for unit testing python in our code, I'm open to suggestions.
Assignee | ||
Comment 7•12 years ago
|
||
You could just abuse the C preprocessor to generate that output, you know. (Save for maybe the labels bit, which I didn't look at very closely.) That would help avoid issues with multi-line HISTOGRAM definitions. The idl parser has unit testing bits, though I don't know if those bits get run as part of the build.
Comment 8•12 years ago
|
||
> That would help avoid issues with multi-line HISTOGRAM definitions.
And with escaped " marks.
Reporter | ||
Comment 9•12 years ago
|
||
I don't think I could. I don't know how to collapse "string""s" with CPP. Python provides an opportunity for extra error/convention checking, parsing string comment for units/labels/etc.
Assignee | ||
Comment 10•12 years ago
|
||
(In reply to Taras Glek (:taras) from comment #9) > I don't think I could. I don't know how to collapse "string""s" with CPP. > Python provides an opportunity for extra error/convention checking, parsing > string comment for units/labels/etc. Mmm, good point. Don't forget that you have to handle all the #defines in the .h too. Maybe filter through CPP to get something consistent to run through Python?
Reporter | ||
Comment 11•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #10) > (In reply to Taras Glek (:taras) from comment #9) > > I don't think I could. I don't know how to collapse "string""s" with CPP. > > Python provides an opportunity for extra error/convention checking, parsing > > string comment for units/labels/etc. > > Mmm, good point. > > Don't forget that you have to handle all the #defines in the .h too. Maybe > filter through CPP to get something consistent to run through Python? That's what the script does...It's the hip new thing, making up for lacking CPP features with python :)
Assignee | ||
Comment 12•12 years ago
|
||
(In reply to Taras Glek (:taras) from comment #11) > (In reply to Nathan Froyd (:froydnj) from comment #10) > > Don't forget that you have to handle all the #defines in the .h too. Maybe > > filter through CPP to get something consistent to run through Python? > > That's what the script does...It's the hip new thing, making up for lacking > CPP features with python :) That's what I get for trying to make intelligent comments when I'm tired.
Comment 13•12 years ago
|
||
FYI subprocesss.check_output requires Python 2.7, and that is not a build requirement iirc. So if you wanted to run an automated test, I think you'd have to do something else.
Reporter | ||
Comment 14•12 years ago
|
||
Attachment #618000 -
Attachment is obsolete: true
Reporter | ||
Comment 15•12 years ago
|
||
Attachment #618004 -
Attachment is obsolete: true
Reporter | ||
Comment 16•12 years ago
|
||
This now handles multiple histograms per line.
Attachment #618360 -
Attachment is obsolete: true
Comment 17•12 years ago
|
||
Just a quick status update on this: The script requires features that only exist in python 2.7 and higher. I tried to run it with Python 3, but ran into some different errors that mean it might not be compatible out of the box there. The metrics infrastructure that we could easily run this script on only have Python 2.6 installed by default, so if we need to run this on our side, we have to figure out whether we try to get a new python version in our environment or whether the script can be ported to work on 2.6. When run on Python 2.7, it threw this error: File "h_to_json.py", line 94, in parse success = self.add(*HGRAM_RE.match(line).groups()) and success AttributeError: 'NoneType' object has no attribute 'groups' Which is likely just an edge case that needs to be handled with a try/catch and some logging. The script isn't overly complex, so anyone could productionize it given some time, but we need to figure out who is the best person to get it done based on the priority of the validation feature to other Telemetry and Metrics work. I'd be really happy if this might be something that could be incorporated into the build environment such that Metrics can just expect the JSON output to arrive in a pre-designated place for us to consume whenever needed. Lawrence said he'd chat with one of the build guys to get a sense of whether that is doable.
Reporter | ||
Comment 18•12 years ago
|
||
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #17) > Just a quick status update on this: > > The script requires features that only exist in python 2.7 and higher. I > tried to run it with Python 3, but ran into some different errors that mean > it might not be compatible out of the box there. > The metrics infrastructure that we could easily run this script on only have > Python 2.6 installed by default, so if we need to run this on our side, we > have to figure out whether we try to get a new python version in our > environment or whether the script can be ported to work on 2.6. > > When run on Python 2.7, it threw this error: > File "h_to_json.py", line 94, in parse > success = self.add(*HGRAM_RE.match(line).groups()) and success > AttributeError: 'NoneType' object has no attribute 'groups' > > Which is likely just an edge case that needs to be handled with a try/catch > and some logging. The script isn't overly complex, so anyone could > productionize it given some time, but we need to figure out who is the best > person to get it done based on the priority of the validation feature to > other Telemetry and Metrics work. I wrote this script to spare you guys the indignity of parsing TelemetryHistograms.h all on your own. > > I'd be really happy if this might be something that could be incorporated > into the build environment such that Metrics can just expect the JSON output > to arrive in a pre-designated place for us to consume whenever needed. > Lawrence said he'd chat with one of the build guys to get a sense of whether > that is doable. The the only way to get this into the build is to get it doing something useful(ie bug 748444). I agree we should move in that direction. However pushing json from our build infrastructure onto metrics sounds like overengineering. Metrics can setup a cronjob to pull the script + telemetryhistograms.h from hg nightly and validate off that. We will also need to back away from using newer python features(2.7 ftw). My coding time is limited, I wrote code accordingly so people don't block on me. In the meantime the script is a near-complete wip and it's a reasonable expectation to have someone complete the script enough to do a few validation runs. mreid has already shown a knack for that. There is no need to block on any infrastructure in the nearterm.
Assignee | ||
Comment 19•12 years ago
|
||
We had some confusion over in bug 781531 about how to adjust to the new world order of histograms being defined in JSON. The scripts here will need some updating for generating, e.g. bucket information. As a first step, we need to refactor the bits we have so people aren't forced to continually derive the ranges of enumerated histograms, and so forth. This patch introduces a histogram_tools.Histogram class that performs a lot of the heavy lifting, adjusts histogram_tools.from_file to return instances of the class, rather than (name, definition) pairs, and generally shuffles the world around. It applies *on top* of the patches from bug 781531 and produces identical results for the generated files. Generating the bucket data should be a much easier task with this patch in place, and I'll get to that tomorrow. Only asking for f? at this point because the dependent bits haven't been r+'d yet. Feel free to r+ it or just request that this be rolled into bug 781531.
Attachment #654774 -
Flags: feedback?(taras.mozilla)
Reporter | ||
Comment 20•12 years ago
|
||
Comment on attachment 654774 [details] [diff] [review] better histogram tools rubberstamp. I like the general idea. This needs more commenting so i can follow along before I can r+ it.
Attachment #654774 -
Flags: feedback?(taras.mozilla) → feedback+
Assignee | ||
Comment 21•12 years ago
|
||
Rebased to what landed on inbound today. What do you mean by "better commenting"? Better explanation or more comments in the source code itself?
Attachment #654774 -
Attachment is obsolete: true
Reporter | ||
Comment 22•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #21) > Created attachment 655143 [details] [diff] [review] > better histogram tools > > Rebased to what landed on inbound today. > > What do you mean by "better commenting"? Better explanation or more > comments in the source code itself? Descriptive comments describing key functions in the module.
Assignee | ||
Comment 23•12 years ago
|
||
Now with some comments.
Attachment #655143 -
Attachment is obsolete: true
Attachment #655686 -
Flags: review?(taras.mozilla)
Assignee | ||
Comment 24•12 years ago
|
||
If we're going to generate range information in Python separately from the range information in the histogram class, we should make sure the two implementations match. This patch does that for DEBUG builds.
Attachment #655789 -
Flags: review?(taras.mozilla)
Reporter | ||
Comment 25•12 years ago
|
||
Comment on attachment 655686 [details] [diff] [review] better histogram tools + table_dispatch(definition['kind'], table, + lambda allowed_keys: Histogram.check_keys(name, definition, allowed_keys)) It's too bad Guido hates currying. I noticed that you went from no comments to: + def description(self): + """Return the description of the histogram.""" + return self._description My rule of thumb is that code has to be obvious enough to not need comments. Prior iteration wasn't obvious enough to me, this one tipped the balance the other way around.
Attachment #655686 -
Flags: review?(taras.mozilla) → review+
Reporter | ||
Comment 26•12 years ago
|
||
Comment on attachment 655789 [details] [diff] [review] check Python-generated range information at compile time + // Check that what Histogram thinks the ranges are lines up with what + // our informational scripts think the ranges are. typo r+. I do not feel strongly about nits below, use your judgement. + "Script calculates invalid # of buckets"); I think C++/Python bucket # mismatch might be easier to diagnose if it goes wrong + "Script calculates invalid information!"); same idea here try_to_coerce_to_integer is a bit of a mess with (either this didn't work or we are using an old crappy python). I do not think it's a good idea to have codepaths that diverge on our infrastructure and local machines. I think we should make toplevel except: clause as the default. ast module looks nice, but in this case it does not seem to simplify life. Also to be pedantic, try_to_coerce_to_integer actually coerces to a primitive, integer in the name is misleading.
Attachment #655789 -
Flags: review?(taras.mozilla) → review+
Assignee | ||
Comment 27•12 years ago
|
||
(In reply to Taras Glek (:taras) from comment #25) > It's too bad Guido hates currying. It's too bad Guido hates functional programming generally. > I noticed that you went from no comments to: > + def description(self): > + """Return the description of the histogram.""" > + return self._description > > My rule of thumb is that code has to be obvious enough to not need comments. > Prior iteration wasn't obvious enough to me, this one tipped the balance the > other way around. I think you are being inconsistent. (In reply to Taras Glek (:taras) from comment #26) > + // Check that what Histogram thinks the ranges are lines up with what > + // our informational scripts think the ranges are. > typo Where? > + "Script calculates invalid # of buckets"); > I think C++/Python bucket # mismatch might be easier to diagnose if it goes > wrong > > + "Script calculates invalid information!"); > same idea here I will make these more informative. > try_to_coerce_to_integer is a bit of a mess with (either this didn't work or > we are using an old crappy python). I do not think it's a good idea to have > codepaths that diverge on our infrastructure and local machines. I think we > should make toplevel except: clause as the default. ast module looks nice, > but in this case it does not seem to simplify life. Fair enough.
Assignee | ||
Comment 28•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/01b657335f1d https://hg.mozilla.org/integration/mozilla-inbound/rev/f8c53a2ecb2b
Assignee: taras.mozilla → nfroyd
Status: NEW → ASSIGNED
OS: Windows 7 → All
Hardware: x86_64 → All
Whiteboard: [leave open]
Comment 29•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/01b657335f1d https://hg.mozilla.org/mozilla-central/rev/f8c53a2ecb2b
Assignee | ||
Comment 30•12 years ago
|
||
A new version of Taras's original script that generates the required json from Histograms.json. Daniel, does this script output everything Metrics would like to know about? Example output being attached shortly.
Attachment #656554 -
Flags: review?(taras.mozilla)
Attachment #656554 -
Flags: feedback?(deinspanjer)
Assignee | ||
Comment 31•12 years ago
|
||
New JSON generated from the new script.
Attachment #618361 -
Attachment is obsolete: true
Assignee | ||
Comment 32•12 years ago
|
||
We may want to make the histogram parameters for boolean and flag histograms in the Python scripts match the parameters in the C++ source code...otherwise things look a little weird in the generated JSON.
Comment 33•12 years ago
|
||
Comment on attachment 656554 [details] [diff] [review] generate bucket information for histograms for metrics's consumption I believe we also need the histogram type so we can distinguish between linear and such. Please forgive the - if this is no longer relevant for some reason. Adding Harsha for feedback since he is working on the validation part.
Attachment #656554 -
Flags: feedback?(schintalapani)
Attachment #656554 -
Flags: feedback?(deinspanjer)
Attachment #656554 -
Flags: feedback-
Assignee | ||
Comment 34•12 years ago
|
||
Ah, fair point about the histogram type.
Attachment #656554 -
Attachment is obsolete: true
Attachment #656554 -
Flags: review?(taras.mozilla)
Attachment #656554 -
Flags: feedback?(schintalapani)
Attachment #656562 -
Flags: feedback?(schintalapani)
Comment 35•12 years ago
|
||
Would the output of this script be wrapped up in something higher level that would describe the product, channel, and any other attributes that might be relevant such as platform or locale? Would that information be inferred from the location in the source tree from which this output is retrieved?
Reporter | ||
Comment 36•12 years ago
|
||
Nathan, this also needs to duplicate histograms that match a certain regexp into STARTUP_ ones as per http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryPing.js#513
Assignee | ||
Comment 37•12 years ago
|
||
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #35) > Would the output of this script be wrapped up in something higher level that > would describe the product, channel, and any other attributes that might be > relevant such as platform or locale? Would that information be inferred > from the location in the source tree from which this output is retrieved? We could do that. I'm not clear on what the best way to distribute this to you from a build is; it's not a user-facing thing, so it shouldn't be packaged and it's not a development/sdk-like thing, so it shouldn't go in the sdk, but it should somehow go alongside the packages themselves...
Assignee | ||
Comment 38•12 years ago
|
||
(In reply to Taras Glek (:taras) from comment #36) > Nathan, this also needs to duplicate histograms that match a certain regexp > into STARTUP_ ones as per > http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/ > TelemetryPing.js#513 That's a good point. Fixed.
Attachment #656562 -
Attachment is obsolete: true
Attachment #656562 -
Flags: feedback?(schintalapani)
Attachment #656585 -
Flags: feedback?(schintalapani)
Comment 39•12 years ago
|
||
we need both bucket info and histogram type.
Assignee | ||
Comment 40•12 years ago
|
||
(In reply to schintalapani from comment #39) > we need both bucket info and histogram type. What's "bucket info" in this context? Just the lower bounds of each bucket, or something more elaborate?
Comment 41•12 years ago
|
||
its an array containing all the values example "buckets": [0, 1, 3, 8, 21, 57, 154, 414, 1114, 3000]
Comment 42•12 years ago
|
||
Hi Nathan, Is the script ready to use. Thanks
Assignee | ||
Comment 43•12 years ago
|
||
Please run the script, examine the output, and indicate whether you think the output is sufficient for your needs. If it is sufficient, we'll get it checked into the tree next week.
Comment 44•12 years ago
|
||
could you give me some instructions running it there is "script v2" what should be the input for this and what about this one https://bug748417.bugzilla.mozilla.org/attachment.cgi?id=656585 thanks, Harsha
Assignee | ||
Comment 45•12 years ago
|
||
https://bug748417.bugzilla.mozilla.org/attachment.cgi?id=656585 is what you want to use. Save it on your machine as gen-histogram-bucket-ranges.py and run from a command prompt: python /path/to/gen-histogram-bucket-ranges.py /path/to/Histograms.json > output.json and examine the output.json file.
Comment 46•12 years ago
|
||
Nathan, from where can i get histogram_tools package
Assignee | ||
Comment 47•12 years ago
|
||
(In reply to schintalapani from comment #46) > from where can i get histogram_tools package In the same directory where Histograms.json lives. You can either run python from that directory to pick it up automatically, or set the PYTHONPATH environment variable.
Comment 48•12 years ago
|
||
confused I only downloaded the code from attachments to this bug. Is there anything else I am missing?
Comment 49•12 years ago
|
||
specifically these two files https://bug748417.bugzilla.mozilla.org/attachment.cgi?id=618861 https://bug748417.bugzilla.mozilla.org/attachment.cgi?id=656585
Assignee | ||
Comment 50•12 years ago
|
||
You need this file: https://bug748417.bugzilla.mozilla.org/attachment.cgi?id=656585 In addition, you also need a checkout of mozilla-central or mozilla-inbound, to get access to histogram_tools.py and Histograms.json, both of which can be found in the toolkit/components/telemetry/ subdirectory of your checkout.
Comment 51•12 years ago
|
||
Made few changes to structure of final json output. Could you tell me which appVersion should this be applied to.
Assignee | ||
Updated•12 years ago
|
Attachment #658230 -
Attachment mime type: text/x-python → text/plain
Comment 52•12 years ago
|
||
Comment 53•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Attachment #662218 -
Attachment mime type: text/x-python → text/plain
Assignee | ||
Comment 54•12 years ago
|
||
Here's a refactored version that's somewhat clearer to read, as well as correcting a bug with the handling of STARTUP_ histograms.
Attachment #618861 -
Attachment is obsolete: true
Attachment #656585 -
Attachment is obsolete: true
Attachment #658230 -
Attachment is obsolete: true
Attachment #662217 -
Attachment is obsolete: true
Attachment #662218 -
Attachment is obsolete: true
Attachment #656585 -
Flags: feedback?(schintalapani)
Assignee | ||
Comment 55•12 years ago
|
||
Here's an actual patch with the new script.
Attachment #664293 -
Attachment is obsolete: true
Attachment #665463 -
Flags: review?(taras.mozilla)
Reporter | ||
Comment 56•12 years ago
|
||
Comment on attachment 665463 [details] [diff] [review] patch + if histogram.low() == 0: + parameters['min'] = 1 + else: + parameters['min'] = histogram.low() what's going on here?
Assignee | ||
Comment 57•12 years ago
|
||
(In reply to Taras Glek (:taras) from comment #56) > + if histogram.low() == 0: > + parameters['min'] = 1 > + else: > + parameters['min'] = histogram.low() > > what's going on here? Oh, I bet this is leftover code from when we gave incorrect parameters for boolean and flag histograms, and the low value of 0 was screwing with something. This should just be: parameters['min'] = histogram.low()
Reporter | ||
Comment 58•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #57) > (In reply to Taras Glek (:taras) from comment #56) > > + if histogram.low() == 0: > > + parameters['min'] = 1 > > + else: > > + parameters['min'] = histogram.low() > > > > what's going on here? > > Oh, I bet this is leftover code from when we gave incorrect parameters for > boolean and flag histograms, and the low value of 0 was screwing with > something. This should just be: > > parameters['min'] = histogram.low() k, please submit a patch that doesn't merely test whether I'm reading it :)
Assignee | ||
Comment 59•12 years ago
|
||
Attachment #665463 -
Attachment is obsolete: true
Attachment #665463 -
Flags: review?(taras.mozilla)
Attachment #665743 -
Flags: review?(taras.mozilla)
Reporter | ||
Updated•12 years ago
|
Attachment #665743 -
Flags: review?(taras.mozilla) → review+
Assignee | ||
Comment 60•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/80ad55f342c9
Whiteboard: [leave open]
Assignee | ||
Comment 61•12 years ago
|
||
[Approval Request Comment] This uplift request is for the benefit of the metrics team, who uses these files to validate Telemetry data coming in. They can use the files from mozilla-central for helping to validate Telemetry pings from FF17, but it'd be more convenient, I think, to have the files in the correct tree. That's my understanding of the situation anyway; if Harsha or Daniel says it doesn't really matter, then let's drop this uplift request. Uplifting this means bug 789371 should also be uplifted; I will make that request separately and it's a small dependency in any event. Bug caused by (feature/regressing bug #): No bug. User impact if declined: None. Testing completed (on m-c, etc.): Tested on m-c for this release cycle. Risk to taking this patch (and alternatives if risky): Very low. The gen-histogram-bucket-changes.py file is not part of the build, and the changes to the build from the other files in the patch have been extensively tested this release cycle. String or UUID changes made by this patch: None.
Attachment #667170 -
Flags: review+
Attachment #667170 -
Flags: approval-mozilla-aurora?
Comment 62•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/80ad55f342c9
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Comment 63•12 years ago
|
||
Comment on attachment 667170 [details] [diff] [review] aurora uplift of histogram_tools.py and gen-histogram-bucket-ranges.py Please re-nominate when it's clear that this is necessary and bug 789371 is ready for approval as well.
Attachment #667170 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
You need to log in
before you can comment on or make changes to this bug.
Description
•