# v1.0

## breaking changes
- as of v0.13.1-38-gb88c3b84, by default we reject data points with a timestamp too far in the future.
  By default the cutoff is at 10% of the raw retention's TTL; for example, with the default
  storage schema `1s:35d:10min:7` the cutoff is at 35d * 0.1 = 3.5d.
  The limit can be configured with the parameter `retention.future-tolerance-ratio`, or the
  enforcement can be disabled entirely with the parameter `retention.enforce-future-tolerance`.
  To predict whether Metrictank would drop incoming data points once the enforcement is turned on,
  the metric `metrictank.sample-too-far-ahead` can be used: it counts the data points which
  would be dropped if the enforcement were turned on while it is off. #1572
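To make the cutoff arithmetic above concrete, here is a small sketch (not Metrictank code; the function name is illustrative) of how the future-tolerance cutoff follows from the raw TTL and the default 10% ratio:

```python
def future_cutoff_seconds(raw_ttl_seconds: int, tolerance_ratio: float = 0.1) -> float:
    """Points stamped further than this many seconds ahead of now would be rejected."""
    return raw_ttl_seconds * tolerance_ratio

# With the default storage schema 1s:35d:10min:7, the raw retention TTL is 35 days:
raw_ttl = 35 * 24 * 3600
cutoff = future_cutoff_seconds(raw_ttl)
print(cutoff / (24 * 3600))  # 3.5 days, i.e. 35d * 0.1
```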
- Prometheus integration removal. As of v0.13.1-97-gd77c5a31, it is no longer possible to use Metrictank
  to scrape Prometheus data, or to query data via PromQL. There was not enough usage (or customer interest)
  to keep maintaining this functionality. #1613
- as of v0.13.1-110-g6b6f475a, tag support is enabled by default (it can still be disabled).
  If metrics with tags were previously ingested while tag support was disabled,
  those tags were treated as a normal part of the metric name. Now that tag support
  is enabled, the tags are treated as tags and are no longer part of the metric name.
  As a result, there is a very unlikely scenario in which some queries don't return the same
  results as before: namely, if they query for tags as part of the metric name.
  (Note: meta tags are still disabled by default.) #1619
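To illustrate the interpretation change, here is a simplified sketch using Graphite's `;key=value` tag syntax (this is illustrative, not Metrictank's actual parsing code):

```python
def split_name_and_tags(series: str, tag_support: bool):
    """With tag support disabled, the whole string is the metric name.
    With it enabled, ';key=value' segments are split out as tags."""
    if not tag_support or ";" not in series:
        return series, {}
    name, *pairs = series.split(";")
    tags = dict(pair.split("=", 1) for pair in pairs)
    return name, tags

print(split_name_and_tags("cpu.usage;host=a1;dc=us", tag_support=False))
# ('cpu.usage;host=a1;dc=us', {})  -- tags stay part of the name
print(split_name_and_tags("cpu.usage;host=a1;dc=us", tag_support=True))
# ('cpu.usage', {'host': 'a1', 'dc': 'us'})
```

A query matching the literal name `cpu.usage;host=a1;dc=us` would therefore stop matching once tag support is enabled.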
- as of v0.13.1-186-gc75005d, the `/tags/delSeries` endpoint no longer accepts a `propagate` parameter.
  It is no longer possible to send the request to only a single node; the request now always propagates
  to all nodes, bringing this method in line with `/metrics/delete`.
- as of v0.13.1-250-g21d1dcd1 (#951), Metrictank no longer excessively aligns all data to the same
  lowest common multiple resolution; instead it keeps data at its native resolution when possible.
  - When queries request mixed resolution data, this will now typically result in larger response datasets,
    with more points, and thus slower responses.
    The `max-points-per-req-soft` and `max-points-per-req-hard` settings will still help curb this problem.
    Note that the hard limit was previously not always applied correctly,
    so queries may run into this limit (and error) when they did not before.
  - This version introduces 2 new optimizations (see the `pre-normalization` and `mdp-optimization` settings).
    The latter is experimental and disabled by default; the former is recommended and enabled by default.
    It helps alleviate the extra cost of queries in certain cases.
    (See https://github.com/grafana/metrictank/blob/master/docs/render-path.md#pre-normalization for more details.)
    When upgrading a cluster in which you want to enable pre-normalization (recommended),
    you must apply caution: pre-normalization requires a PNGroup property to be
    communicated in intra-cluster data requests, which older peers don't have.
    The peer receiving the client request, which fans out the query across the cluster, only sets
    the flag if the optimization is enabled (and applicable). If the flag is set on the requests,
    the peer needs the same flag set in the responses it receives from its peers in order to tie
    the data back to the initiating requests. Otherwise, the data won't be included in the response,
    which may result in missing series, incorrect aggregates, etc.
    Peers responding to a getdata request include the field in the response, whether they have
    the optimization enabled or not.
    Thus, to upgrade an existing cluster, you have 2 options:
    A) disable pre-normalization and do an in-place upgrade; then enable it and do another in-place upgrade.
       This works regardless of whether you have separate query peers, and regardless of whether you first
       upgrade query or shard nodes.
    B) do a colored deployment: create a new gossip cluster that has the optimization enabled from the
       get-go, then delete the older deployment.
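The PNGroup tie-back behavior described above can be sketched as follows. This is a deliberately simplified model of the fan-out, not Metrictank's wire format; everything except the PNGroup name is illustrative:

```python
# An old (pre-upgrade) peer echoes only the fields it knows about,
# silently dropping the PNGroup property from its response.
def old_peer_respond(request: dict) -> dict:
    return {"series": ["a", "b"], "target": request["target"]}

# An upgraded peer echoes PNGroup back, whether or not it uses the optimization.
def new_peer_respond(request: dict) -> dict:
    return {"series": ["a", "b"], "target": request["target"],
            "PNGroup": request.get("PNGroup")}

# The initiating peer only ties data back when the response's PNGroup
# matches the one it set on the request.
def tie_back(request: dict, response: dict) -> list:
    if request.get("PNGroup") is not None and response.get("PNGroup") != request["PNGroup"]:
        return []  # data dropped -> missing series, incorrect aggregates
    return response["series"]

req = {"target": "cpu.usage", "PNGroup": 42}
print(tie_back(req, new_peer_respond(req)))  # ['a', 'b']
print(tie_back(req, old_peer_respond(req)))  # [] -- the mixed-version failure mode
```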
- as of v0.13.1-384-g82dedf95, the meta record index configuration parameters have been moved out
  of the section `cassandra-idx`; they now have their own section `cassandra-meta-record-idx`.
- as of v0.13.1-433-g4c801819, Metrictank proxies bad requests to Graphite,
  though as of v0.13.1-577-g07eed80f this is configurable via the `http.proxy-bad-requests` flag.
  Leave it enabled if your queries are in the grey zone (rejected by Metrictank, tolerated by Graphite);
  disable it if you don't like the additional latency.
  The aspiration is to remove this entire feature once we work out any remaining kinks in Metrictank's
  request validation.
- as of v0.13.1-788-g79e4709 (see #1831), the option `reject-invalid-tags` was removed. Another option,
  `reject-invalid-input`, was added in its place, with a default value of `true`. This new option rejects
  invalid tags as well as invalid UTF8 data found in the metric name, tag keys or tag values.
  The exported stat `input.xx.metricdata.discarded.invalid_tag` was also renamed to
  `input.xx.metricdata.discarded.invalid_input`, so dashboards will need to be updated accordingly.
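The kind of check the new option performs can be sketched like this (an approximation of the described behavior, not Metrictank's actual validation code):

```python
def is_valid_input(name: bytes, tags: dict) -> bool:
    """Reject any metric whose name, tag key or tag value is not valid UTF8.
    (Illustrative sketch only; real validation also covers invalid tag syntax.)"""
    for blob in (name, *tags.keys(), *tags.values()):
        try:
            blob.decode("utf-8")
        except UnicodeDecodeError:
            return False
    return True

print(is_valid_input(b"cpu.usage", {b"host": b"a1"}))        # True
print(is_valid_input(b"cpu.usage", {b"host": b"\xff\xfe"}))  # False: not UTF8
```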
## index
- meta tag performance improvements. #1541, #1542
- meta tag support for Bigtable. #1646
- bugfix: return correct counts when deleting multiple tagged series. #1641
- fix: auto complete should not ignore meta tags if they are also metric tags. #1649
- fix: update cass/bt index when deleting tagged metrics. #1657
- fix various index bugs. #1664, #1667, #1748, #1766, #1833
- bigtable index fix: only load current metricdefs. #1564
- Fix deadlock when the write queue is full. #1569
## fakemetrics
- filters. first filter is an "offset filter". #1762
- import 'schemasbackfill' mode. #1666
- carbon tag support. #1691
- add values policy. #1773
- configurable builders + "daily-sine" value policy. #1815
- add a "Containers" mode to fakemetrics with configurable churn #1859
## other tools
- mt-gateway: new tool to receive data over http and save into kafka, for MT to consume. #1608, #1627, #1645
- mt-parrot: continuous validation by sending dummy stats and querying them back. #1680
- mt-whisper-importer-reader: print message when everything done with the final stats. #1617
## new native processing functions
- aggregate() #1751
- aliasByMetric() #1755
- constantLine() #1734 (note: due to a yet-undiagnosed bug, disabled in #1783)
- groupByNode, groupByNodes. #1753, #1774
- invert(). #1791
- minMax(). #1792
- offset() #1621
- removeEmptySeries() #1754
- round() #1719
- unique() #1745
## other
- dashboard tweaks. #1557, #1618
- docs improvements. #1559, #1620, #1594, #1796
- tags/findSeries - add lastts-json format. #1580
- add catastrophe recovery for cassandra (re-resolve when all IPs have changed). #1579
- Add `/tags/terms` query to get counts of tag values. #1582
- expr: be more lenient: allow quoted ints and floats. #1622
- Replaced Macaron logger with custom logger enabling query statistics. #1634
- Logger middleware: support gzipped responses from graphite-web. #1693
- Fix error status codes. #1684
- Kafka ssl support. #1701
- Aggmetrics: track how long a GC() run takes and how many series deleted #1746
- only connect to peers that have a non-null IP address. #1758
- Added a panic recovery middleware after the logger so that we know what the query was that triggered a panic. #1784
- asPercent: don't panic on empty input. #1788
- Return HTTP 499 instead of 502 when a client disconnects during a render query with Graphite proxying. #1821
- Deduplicate resolve series requests. #1794
- Deduplicate duplicate fetches #1855
- set points-return more accurately. #1835