Skip to content
This repository has been archived by the owner on Jun 30, 2018. It is now read-only.

When search response is extremely large logging takes a long time #155

Closed
redbeard opened this issue Nov 29, 2011 · 11 comments
Closed

When search response is extremely large logging takes a long time #155

redbeard opened this issue Nov 29, 2011 · 11 comments

Comments

@redbeard
Copy link
Contributor

This is an edge case which happens when the query response is extremely large due to large document content and the logger level is in debug mode.

Essentially the logger rendering the response content into separate lines takes long enough that it times out the request if originating from a Rails controller.

Tracked down to line 50 in logger.rb,
json.to_s.each_line { |line| content += "# #{line}" } unless json.to_s !~ /\S/

The call to .each_line takes a non-trivial amount of time when the response has hundreds of lines.

Suggested trivial patch:

index d8c0122..a84f6cc 100644
--- a/lib/tire/logger.rb
+++ b/lib/tire/logger.rb
@@ -47,7 +47,8 @@ module Tire
       content += " [#{status}]"
       content += " (#{took} msec)" if took
       content += "\n#\n" unless json.to_s !~ /\S/
-      json.to_s.each_line { |line| content += "# #{line}" } unless json.to_s !~ /\S/
+      line_no = 0
+      json.to_s.each_line { |line| content += "# #{line}"; line_no = line_no + 1; break if (line_no > 50) } unless json.to_s !~ /\S/
       content += "\n\n"
       write content
     end
@karmi
Copy link
Owner

karmi commented Nov 29, 2011

Hi, I understand the issue, but is this something which is expected by the end-user? How many lines is "too much"? 50, 100, 1000? Wouldn't it be better to don't use debug mode in this case?

@redbeard
Copy link
Contributor Author

If we had distinction between 'debug' and 'trace' levels, I would say do full print outs at 'trace' and 50 lines per request maximum for 'debug' level.

Right now, with large documents (think large Wikipedia articles with associated metadata) one can't use debug mode unless one sets _source = disabled in the index mapping as a controller request will time out even for 20 documents returned.

@karmi
Copy link
Owner

karmi commented Nov 30, 2011

I don't know. What I do know is that cutting the response being logged at 50 lines is a) arbitrary, b) too short for common use-cases. Is there any other way around this?

@redbeard
Copy link
Contributor Author

Perhaps a discrete 'log response body' configuration option for debug mode?

Alternatively, a time-based limit on the response body logging. Something obviously smells when the logging of a request takes over 2 minutes!

BTW, I believe logging the json.to_s wouldn't take as long -- the .each_line block is the time consuming operation.

@karmi
Copy link
Owner

karmi commented Dec 1, 2011

Tal, I really don't understand why this is a problem.

The debug level is mostly useful for inspecting responses when things don't go as expected, or, mainly, to post a recreation of some ES session to mailing list or IRC. (The each_line is there for formatting.)

I'm sorry but I'm closing this.

@karmi karmi closed this as completed Dec 1, 2011
@karmi
Copy link
Owner

karmi commented Dec 7, 2011

@redbeard: Tal, it just occured to me -- why don't you write your own custom logger, which limits the output, and pass it to Tire.config { logger MyCustomClass }? Is that a viable solution?

@redbeard
Copy link
Contributor Author

redbeard commented Dec 7, 2011

Great idea, I'll give it a try!

On 08/12/2011, at 6:34, Karel Minarik
[email protected]
wrote:

@redbeard: Tal, it just occured to me -- why don't you write your own custom logger, which limits the output, and pass it to Tire.config { logger MyCustomClass }? Is that a viable solution?


Reply to this email directly or view it on GitHub:
#155 (comment)

@bbonamin
Copy link
Contributor

Sadly we recently ran into the same problem.

The long time to log is caused by exactly that line:

json.to_s.each_line { |line| content += "# #{line}" } unless json.to_s !~ /\S/

The problem is using += to concatenate the string. "+=" is known to have very bad performance, because it instantiates a new object every time a new assignment occurs:

>> a = 'a'
=> "a"
>> a.object_id
=> 70292542636400
>> a += 'b'
=> "ab"
>> a.object_id
=> 70292542545600

The solution is to change the += for the "shovel" operator, which reuses the same object.
E.g. with a 20 thousand lines JSON response:

Before the change
99.74%  48.04 seconds String#each_line

After the change
74.91%  0.11 seconds String#each_line

I guess it's sad this library isn't mantained anymore, because we could patch it.

@karmi
Copy link
Owner

karmi commented Apr 11, 2014

@bbonamin What makes you think and say that the "library isn't maintained anymore"? Did you check the number of discussions going on in issues here, the amount or assistence given to people on IRC, and other things?

To simply repeat it again, this library is actively maintained, it has just been "retired" or "deprecated" in favour of the new suite of gems, which are superior in all aspects of software development.

I do agree that the particular part of code should absolutely use the << operator. On the other hand, I'm having a hard time imagining leaving the "trace" on when you're concerned about performance. It's a debugging tool and setting.

@bbonamin
Copy link
Contributor

@karmi Sorry if I offended you, that wasn't my intention. These type of comments made me think that this wasn't actively mantained anymore, but now that I've read your Re(Tire) announcement, I understand that fixing bugs is intended.

I agree that of course on production settings the trace should be disabled, but improving 100x the performance of the tracing should be simple, and could help people avoid these type of problems in the future, where you might be clueless as why your performance between development and production is so different.

Would you be willing to accept a pull request with these changes?

@karmi
Copy link
Owner

karmi commented Apr 11, 2014

@bbonamin No, you didn't offended me in any way, I just wanted to make the context clear.

In general, I'm against adding features to the gem, or "fixing" minor issues like the one you linked, that's just wasted energy.

I would 100% and absolutely merge a pull request for this, that's of course a plain oversight in the code, and the << should be of course used.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants