In the spin_process function in luminosity.py (skyline/luminosity/luminosity.py, lines 390 to 393 at 95d4a72) I detected a condition that can block luminosity processing, noisily. Environments that have been processing their anomalies without data loss are probably not affected. In my case I had loaded test metrics and later removed most of them, and probably cleaned up some other metrics too, during Skyline processing. This causes the luminosity process to get stuck at this point, because the metrics can no longer be found in the DB:

logger.info('no correlations found for %s anomaly id %s' % (
    base_name, str(anomaly_id)))
return False
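The fix I have in mind is to record the anomaly as processed before returning, instead of only returning False. Here is a minimal sketch of that idea, simulated with an in-memory SQLite database; the real code talks to MySQL via mysql_select, and the table layout, the metric_exists helper and spin_process_step are simplified assumptions of mine, not the actual Skyline API:

```python
import sqlite3

# Simplified stand-ins for Skyline's anomalies and luminosity tables.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE anomalies (id INTEGER PRIMARY KEY, metric TEXT)')
conn.execute('CREATE TABLE luminosity (id INTEGER PRIMARY KEY)')
conn.execute("INSERT INTO anomalies (id, metric) VALUES (1, 'test.metric.deleted')")

def metric_exists(metric):
    # Stand-in for the DB lookup that fails after test metrics were cleaned up.
    return False

def spin_process_step(anomaly_id, metric):
    if not metric_exists(metric):
        # Instead of only returning False, record the anomaly id in the
        # luminosity table so the next spin does not pick it up again.
        conn.execute('INSERT INTO luminosity (id) VALUES (?)', (anomaly_id,))
        conn.commit()
        return False
    return True

spin_process_step(1, 'test.metric.deleted')
processed = conn.execute('SELECT id FROM luminosity').fetchall()
print(processed)  # the stuck anomaly id is now marked as processed
```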
So this section always returns from the spinning process without marking the found unprocessed anomaly_id as processed: there was data loss or cleanup somehow, but the anomaly_id was left behind in the DB. The reference anomaly_id is taken from this section (skyline/luminosity/luminosity.py, lines 668 to 681 at 95d4a72):
query = 'SELECT id FROM luminosity WHERE id=(SELECT MAX(id) FROM luminosity) ORDER BY id DESC LIMIT 1'
results = None
try:
    results = mysql_select(skyline_app, query)
except:
    logger.error(traceback.format_exc())
    logger.error('error :: MySQL query failed - %s' % query)
if results:
    try:
        last_processed_anomaly_id = int(results[0][0])
        logger.info('last_processed_anomaly_id found from DB - %s' % str(last_processed_anomaly_id))
    except:
        logger.error(traceback.format_exc())
Due to this data loss in terms of metrics, we should alter this query:

SELECT id FROM luminosity WHERE id=(SELECT MAX(id) FROM luminosity) ORDER BY id DESC LIMIT 1

In any case, this query could simply be rewritten, since we don't need a subselect here:

SELECT MAX(id) FROM luminosity
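To illustrate that the subselect form and the plain MAX(id) form return the same value, here is a quick check using SQLite as a stand-in for MySQL (the table here is a toy, not the real luminosity schema):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE luminosity (id INTEGER PRIMARY KEY)')
conn.executemany('INSERT INTO luminosity (id) VALUES (?)', [(1,), (2,), (5,)])

subselect = conn.execute(
    'SELECT id FROM luminosity WHERE id=(SELECT MAX(id) FROM luminosity) '
    'ORDER BY id DESC LIMIT 1').fetchone()[0]
plain_max = conn.execute('SELECT MAX(id) FROM luminosity').fetchone()[0]
print(subselect, plain_max)  # both queries return the same id, 5
```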
But I am still uncomfortable with this query, because it does not properly handle the unprocessed luminosity anomaly ids. First of all, MAX(id) doesn't guarantee the latest unprocessed anomaly id, as you know, because sequential primary key ids of previously deleted records can be handed out again to new rows, since those values are available at that moment.
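The id-reuse problem can be demonstrated with SQLite, where an INTEGER PRIMARY KEY without AUTOINCREMENT reuses the highest deleted id immediately; whether and when MySQL reassigns AUTO_INCREMENT values in a given deployment (for example after a server restart with pre-8.0 InnoDB) is an assumption about the environment, but the principle is the same:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# INTEGER PRIMARY KEY without AUTOINCREMENT allows id reuse after deletes.
conn.execute('CREATE TABLE anomalies (id INTEGER PRIMARY KEY, metric TEXT)')
conn.executemany('INSERT INTO anomalies (metric) VALUES (?)',
                 [('m1',), ('m2',), ('m3',)])
conn.execute('DELETE FROM anomalies WHERE id=3')  # cleanup deletes the max id
cur = conn.execute("INSERT INTO anomalies (metric) VALUES ('m4')")
print(cur.lastrowid)  # 3 again - MAX(id) cannot tell the old and new rows apart
```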
I tried the following query, after the one above, in my environment; it works and does not get the spinning process stuck at a non-processable anomaly_id caused by non-existing metrics:
from time import time

now = int(time())
after = now - 600
query = 'SELECT id FROM anomalies WHERE id NOT IN (SELECT DISTINCT id FROM luminosity) AND anomaly_timestamp > \'%s\' ORDER BY anomaly_timestamp ASC LIMIT 1' % str(after)
The time range can be tuned to an optimum window, but this is closer to an ideal DB query. However, I'm still uncomfortable with the condition WHERE id NOT IN (SELECT DISTINCT id FROM luminosity), since the luminosity table has a continuously increasing record count and this would cause a performance decrease in the very long run. It would be better to also have a luminosity_processed flag in the anomalies table for this case, and the condition could be changed like this:
query = 'SELECT id FROM anomalies WHERE luminosity_processed=0 AND anomaly_timestamp > \'%s\' ORDER BY anomaly_timestamp ASC LIMIT 1' % str(after)
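A minimal sketch of the flag approach, again simulated with an in-memory SQLite database; the ALTER TABLE migration, the luminosity_processed column and the UPDATE are my proposals, not existing Skyline schema, and parameter binding is used here instead of string formatting:

```python
import sqlite3
from time import time

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE anomalies (id INTEGER PRIMARY KEY, anomaly_timestamp INTEGER)')
# Hypothetical one-off migration adding the flag; rows default to unprocessed.
conn.execute('ALTER TABLE anomalies ADD COLUMN luminosity_processed INTEGER DEFAULT 0')

now = int(time())
conn.executemany('INSERT INTO anomalies (id, anomaly_timestamp) VALUES (?, ?)',
                 [(1, now - 300), (2, now - 100)])

after = now - 600
query = ('SELECT id FROM anomalies WHERE luminosity_processed=0 '
         'AND anomaly_timestamp > ? ORDER BY anomaly_timestamp ASC LIMIT 1')
first = conn.execute(query, (after,)).fetchone()[0]
# Once luminosity has handled the anomaly (or its metric is missing), flip
# the flag so the same anomaly id is never picked up again.
conn.execute('UPDATE anomalies SET luminosity_processed=1 WHERE id=?', (first,))
second = conn.execute(query, (after,)).fetchone()[0]
print(first, second)  # 1 then 2
```

With the flag, the query cost no longer grows with the size of the luminosity table, only with the number of unprocessed anomalies in the window.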