Replication with attachments never completes, {mp_parser_died,noproc} error #745
Comments
Seeing this now in multiple production environments. In one case, it is potentially completely freezing a node participating in continuous replication with large attachments. In the other, it's a one-time replication that must be restarted many times before it runs to completion. Discussion on IRC today with @nickva follows.
|
Hi everyone! I think I have some more information on this issue in the form of a side effect. My setup is a small cluster with just 3 nodes continuously replicating a few databases from another, bigger one. Only 3 of the replicated databases hold attachments and, by chance, the same node is responsible for replicating all 3 of them. That node throws the described error quite often (a few thousand times per hour), depending on the speed at which documents are received. That particular node also shows a continuously growing number of processes. After connecting the Observer to that node to see which processes are there, I could see a lot of them stuck, and this node has quite a lot of 'erlang:apply/2' processes sitting in 'couch_httpd_multipart:mp_parse_atts/2'. I think there may be something preventing these processes from exiting, and that's why they pile up until the node freezes. Hope this helps. |
Hi Carlos, I was wondering roughly how many attachments you have and their approximate size distribution. How about large documents, or document ID lengths larger than 4KB? And this is still CouchDB 2.1.0 as mentioned above? Also, what are the values of the request/document size limit configuration parameters?
Note the default value of the request size limit in particular. Basically I'm trying to see whether this is an issue of the target cluster rejecting requests because of one of those limits, or whether something else is going on. |
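As a quick reference, these limits can be read back from the node-local config port (a sketch assuming the default port 5986; adjust host/port for your install):
# request size limit (httpd section) and document size limit (couchdb section)
curl -s http://127.0.0.1:5986/_config/httpd/max_http_request_size
curl -s http://127.0.0.1:5986/_config/couchdb/max_document_size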
Hi Nick, I've been reviewing some of the affected documents. Their sizes are not big, at least the ones I've looked at: they range from 4 to 60 KB, mostly PDF and XLS attachments. I haven't iterated through all of them to compute the distribution you suggest. Is there an easier CouchDB way to get that overview of attachment sizes? I'm using CouchDB 2.1.0 here, and the configs should both be the defaults, as I haven't specified either of them; I'm checking the config values on the nodes to confirm. |
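There's no single built-in report of attachment sizes, but each doc's _attachments stub carries a length field, so a rough overview can be scraped per database (a sketch assuming a db called mydb, the default port 5984, and jq):
# print "docid/attname: length" for every attachment in the database
curl -s 'http://127.0.0.1:5984/mydb/_all_docs?include_docs=true' | \
  jq -r '.rows[] | .id as $id | (.doc._attachments // {}) | to_entries[] | "\($id)/\(.key): \(.value.length)"'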
Hi Carlos, Thanks for the additional info! So it seems like with retries they eventually finish. We'd still rather not have these errors in the first place... It doesn't seem like request/document/attachment size limits are involved in this case. Now I am thinking it could perhaps be unreliable network connections or a large number of replications. In your setup, how reliable is the network? Any chance of intermittent connectivity issues, high latency, or maybe running out of sockets? Another question is how many replications are running at the same time; would there be more than 500 per cluster node? That's currently the max jobs value for the scheduler, and if there are more than that, the scheduling replicator will stop some jobs and start others as it cycles through them. Wondering if that is an issue. |
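To rule the 500-job limit in or out, the scheduler can be queried directly (a sketch assuming default ports and jq):
# number of replication jobs the scheduler currently knows about
curl -s http://127.0.0.1:5984/_scheduler/jobs | jq '.total_rows'
# the configured scheduler limit (node-local config port)
curl -s http://127.0.0.1:5986/_config/replicator/max_jobs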
Hi Nick, Although I've definitely seen some replication errors pointing to a closed connection on the source from time to time, they are very sparse, and I don't think we're affected by an unreliable network either: the source is a cluster hosted on Softlayer (central US, I think) and the target is a cluster in the Europe-West region of Google Compute Engine. Both platforms, while far apart in terms of distance, have very reliable and strong network links. As for the number of replications, we're nowhere near that figure: I'm replicating on a three-node cluster, with each node running 9 replications. Regards |
We believe that we're seeing this in internal testing on single-node hosts doing local replication too. Our attachment sizes can be in the gigabytes. |
Hi @elistevens, Thanks for your report. Would you be able to make a short script to reproduce the issue, or at least describe the steps in more detail? For example: 1: clone couch at version X, Erlang version Y, OS version Z, etc.; 2: build; 3: set up with these config parameters; 4: create 2 dbs; 5: populate with attachments of this size; ... |
@calonso, Sorry for the delayed response. There is a minor fix in that code in 2.1.1; would you be able to retry with that latest version to see if it results in the same error? If you do upgrade, take a look at the release notes regarding the vm.args file and localhost vs 127.0.0.1 node names. |
Hi @nickva. We updated to 2.1.1 a while ago and unfortunately we keep seeing the same error... :( Thanks! |
@calonso, thanks for checking, it helps to know that. |
Bah, my earlier draft response got eaten by a browser shutdown. I don't have an easy repro script, sadly. We're seeing the issue under load during our test runs, but any single test seems to work fine when run in isolation. Our largest attachments are in the range of 100MB to 1GB. I know that's against recommended practices, but that wasn't clear when the bones of our architecture were laid down in ~2011. We are running 2.1.1 on Ubuntu, using the official .debs. |
Hey @nickva I spent time today on getting a repro for this, as it's affecting more and more people. Bear with me on the setup, it's a little involved. Set up a VM with the following parameters (I used VMWare Workstation):
Now you're ready to set up the test:
Repeat the above a few times to get the DB to 1GB or larger. You can increase the 10 (the number of docs created per run), but at some point you'll run out of RAM, so be careful. The script creates sample docs with a few fields and a 50MB attachment full of random bytes (see the sketch after this comment). Now to run the test:
If the above succeeds, repeat it. This produces a failure for me within 10 minutes. The command line returns:
and the logfile has errors identical to those in the original post above, and in #574. |
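A rough sketch of that kind of generator (made-up db name and port; 10 docs, each with a few fields and a 50MB attachment of random bytes):
# create a source db and fill it with sample docs carrying 50MB random attachments
DB=http://127.0.0.1:5984/source_db
curl -sX PUT $DB
for i in $(seq 1 10); do
  dd if=/dev/urandom of=/tmp/att.bin bs=1M count=50 2>/dev/null
  REV=$(curl -sX PUT "$DB/doc-$i" -H content-type:application/json \
        -d "{\"seq\": $i, \"type\": \"sample\"}" | sed 's/.*"rev":"\([^"]*\)".*/\1/')
  curl -sX PUT "$DB/doc-$i/att.bin?rev=$REV" \
       -H content-type:application/octet-stream --data-binary @/tmp/att.bin
done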
@wohali thanks for that test! I'll take a look at it when I get a chance. |
Leaving this here as it might be relevant: a similar issue was noticed by someone using attachments in the 10MB range, with one larger attachment around 50MB. Investigation on the #couchdb-dev IRC channel pointed at the same area of code. Possibly also relevant in that case: there were intermittent network problems, with nodes being connected and disconnected. |
FYI the intermittent network problems are not a prerequisite for this problem to surface. However, I think we are going in the right direction thinking this is related to incorrect attachment length calculation and/or incomplete network transfers. |
One other thing: there was a previous attempt at changing some of this behaviour that never landed and that references some old JIRA tickets: |
@wohali thanks, it does help! |
(From discussion on IRC) This might be related to setting a lower max http request limit. Before, it defaulted to 4GB (which is what the code has), but the default.ini file set it to 64MB, so that became the value being used. The max request limit will prevent larger attachments from replicating. Also, the 413 error is not always raised cleanly (see #574, also referenced in the top description). To confirm whether this is the cause or is affecting this issue at all, you can try bumping `httpd/max_http_request_size`
to a higher value, perhaps one that's two or three times larger than the largest attachment or document (to account for some overhead). |
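Concretely, using the node-local config port the same way the repro script further down does (a sketch assuming port 5986; 4294967296 is the old 4GB default):
# bump the request size limit back to 4GB on the node-local config port
curl -X PUT http://127.0.0.1:5986/_config/httpd/max_http_request_size -d '"4294967296"'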
Related to this, we also started to enforce http request limits more strictly. |
Moar data from affected nodes. I've listed processes grouped and counted by current function. Script:
io:format("~p", [
lists:keysort(2,
maps:to_list(lists:foldl(
fun(Elm, Acc) ->
case Elm of
{M, F, A} ->
N = maps:get({M, F, A}, Acc, 0),
maps:put({M, F, A}, N + 1, Acc);
Else ->
Acc
end
end,
#{},
lists:map(
fun(Pid) ->
case process_info(Pid) of
undefined -> [];
Info -> proplists:get_value(current_function, Info)
end
end,
processes()
)
))
)
])
Output from an affected node:
[{{code_server,loop,1},1},
{{couch_replicator_scheduler,stats_updater_loop,1},1},
{{cpu_sup,measurement_server_loop,1},1},
{{cpu_sup,port_server_loop,2},1},
{{erl_eval,do_apply,6},1},
{{erl_prim_loader,loop,3},1},
{{erlang,hibernate,3},1},
{{gen,do_call,4},1},
{{global,loop_the_locker,1},1},
{{global,loop_the_registrar,0},1},
{{inet_gethost_native,main_loop,1},1},
{{init,loop,1},1},
{{mem3_shards,'-start_changes_listener/1-fun-0-',1},1},
{{memsup,port_idle,1},1},
{{net_kernel,ticker_loop,2},1},
{{shell,shell_rep,4},1},
{{standard_error,server_loop,1},1},
{{user,server_loop,2},1},
{{couch_changes,wait_updated,3},2},
{{prim_inet,recv0,3},2},
{{dist_util,con_loop,9},3},
{{gen_event,fetch_msg,5},6},
{{couch_os_process,'-init/1-fun-0-',2},23},
{{application_master,loop_it,4},25},
{{application_master,main_loop,2},25},
{{prim_inet,accept0,2},29},
{{couch_httpd_multipart,mp_parse_atts,2},31},
{{fabric_db_update_listener,cleanup_monitor,3},210},
{{fabric_db_update_listener,wait_db_updated,1},210},
{{rexi_monitor,wait_monitors,1},210},
{{rexi_utils,process_message,6},212},
{{couch_event_listener,loop,2},412},
{{couch_httpd_multipart,maybe_send_data,1},881},
{{couch_doc,'-doc_from_multi_part_stream/4-fun-1-',1},912},
{{gen_server,loop,6},2816}]
Unaffected node in the same cluster:
[{{code_server,loop,1},1},
{{couch_ejson_compare,less,2},1},
{{couch_index_server,get_index,3},1},
{{couch_replicator_scheduler,stats_updater_loop,1},1},
{{cpu_sup,measurement_server_loop,1},1},
{{cpu_sup,port_server_loop,2},1},
{{erl_eval,do_apply,6},1},
{{erl_prim_loader,loop,3},1},
{{fabric_util,get_shard,4},1},
{{global,loop_the_locker,1},1},
{{global,loop_the_registrar,0},1},
{{inet_gethost_native,main_loop,1},1},
{{init,loop,1},1},
{{mem3_shards,'-start_changes_listener/1-fun-0-',1},1},
{{memsup,port_idle,1},1},
{{net_kernel,ticker_loop,2},1},
{{prim_inet,recv0,3},1},
{{shell,shell_rep,4},1},
{{standard_error,server_loop,1},1},
{{user,server_loop,2},1},
{{couch_changes,wait_updated,3},2},
{{dist_util,con_loop,9},3},
{{erlang,hibernate,3},3},
{{gen_event,fetch_msg,5},6},
{{rexi,wait_for_ack,2},8},
{{couch_os_process,'-init/1-fun-0-',2},16},
{{application_master,loop_it,4},25},
{{application_master,main_loop,2},25},
{{prim_inet,accept0,2},28},
{{couch_httpd_multipart,mp_parse_atts,2},37},
{{fabric_db_update_listener,wait_db_updated,1},112},
{{fabric_db_update_listener,cleanup_monitor,3},113},
{{rexi_monitor,wait_monitors,1},114},
{{rexi_utils,process_message,6},114},
{{couch_event_listener,loop,2},344},
{{couch_httpd_multipart,maybe_send_data,1},361},
{{couch_doc,'-doc_from_multi_part_stream/4-fun-1-',1},398},
{{gen_server,loop,6},10270}]
Output from a cluster that doesn’t have attachments:
[{{code_server,loop,1},1},
{{cpu_sup,measurement_server_loop,1},1},
{{cpu_sup,port_server_loop,2},1},
{{erl_eval,do_apply,6},1},
{{erl_prim_loader,loop,3},1},
{{erts_code_purger,loop,0},1},
{{fabric_db_update_listener,cleanup_monitor,3},1},
{{fabric_db_update_listener,wait_db_updated,1},1},
{{global,loop_the_locker,1},1},
{{global,loop_the_registrar,0},1},
{{init,loop,1},1},
{{net_kernel,ticker_loop,2},1},
{{rexi_monitor,wait_monitors,1},1},
{{rexi_utils,process_message,6},1},
{{shell,shell_rep,4},1},
{{standard_error,server_loop,1},1},
{{timer,sleep,1},1},
{{user,server_loop,2},1},
{{dist_util,con_loop,2},3},
{{gen_event,fetch_msg,5},6},
{{couch_changes,wait_updated,3},10},
{{couch_event_listener,loop,2},19},
{{application_master,loop_it,4},24},
{{application_master,main_loop,2},24},
{{couch_os_process,'-init/1-fun-0-',2},32},
{{prim_inet,accept0,2},33},
{{erlang,hibernate,3},75},
{{gen_server,loop,6},2521}] |
Possible solution from davisp discussed on IRC: https://gist.github.com/davisp/27cd7ab54cdffeaa6e96590df4f988f9 |
In some cases the higher level code in `couch_replicator_api_wrap` needs to handle retries explicitly and cannot cope with retries happening in the lower level http client. In such cases it sets `retries = 0`. For example: https://github.com/apache/couchdb/blob/master/src/couch_replicator/src/couch_replicator_api_wrap.erl#L271-L275 The http client should then avoid unconditional retries and instead consult the `retries` value. If `retries = 0`, it shouldn't retry and should instead bubble the exception up to the caller. This bug was discovered when attachments were replicated to a target cluster whose resources were constrained. Since attachment `PUT` requests were made from the context of an open_revs `GET` request, the `PUT` requests timed out and were retried. However, because the retry didn't bubble up to the `open_revs` code, the second `PUT` request would die with a `noproc` error, since the old parser had exited by then. See issue #745 for more.
Occasionally it's possible to lose track of our RPC workers in the main multipart parsing code. This change monitors each worker process and then exits if all workers have exited before the parser considers itself finished. Fixes part of #745
So in testing with a client, this no longer hangs/crashes/eats all the RAM, but it does still cause an issue where a too-large request body fails to transmit a document. The replicator thinks it HAS successfully transferred the document, and declares replication successful. A subsequent attempt to GET the document results in a 404. Here is a censored excerpt from the logfile of the situation:
Note that in extensive testing, this has only happened four times, so I'm not sure I can provide an easy reproducer here, but we'll keep at it. Couch was running at the info log level for this test, so I'm going to bump it up to debug level and try the test again, hoping for a duplicate. |
If some requests fail with a 413 it's not surprising that the replication completes; it should just bump the doc_write_failures count. The question is why that one request fails with a 413 to start with. Good call on debug logs. Also, what are the doc sizes involved, and how many revisions per document? Any attachments? And what are the relevant size-limit config values? |
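For a one-shot _replicate call that counter is visible in the response history; for running jobs it shows up in _active_tasks (a sketch assuming the default port and jq):
# doc_write_failures (and friends) for running replication jobs
curl -s http://127.0.0.1:5984/_active_tasks | \
  jq '.[] | select(.type=="replication") | {source, target, docs_written, doc_write_failures}'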
Yes, attachments. You can see in the log that revisions per doc are low. Everything else is default. (Working around this by bumping the defaults is just sweeping the error under the rug...) Of course we should be replicating the document. Why is it even getting a 413 in the first place? It's replicating a document from the same server to the same server, no settings have changed; surely it should be able to PUT a document it just did a GET of from itself. I believe this test (I didn't write it) runs in a loop, so the data is being replicated over and over from one database to the next on the same server. Finally, even with a bumped limit... |
We're now hard-rejecting attachments greater than the configured request size limit. @nickva I have an excellent reproducible case:
/cc @janl |
Great repro Joan. I played with it and came up with this: the Python script uses the standalone attachment API, while the twist is that the replicator uses multipart requests, not standalone attachment requests. Multipart requests are subject to the `max_http_request_size` limit. This leads to the observed behaviour that you can create an attachment in one db and can NOT replicate that attachment to another db on the same CouchDB node (or another node with the same `max_http_request_size` setting). Applying the request size limit to the standalone attachment API as well (see the diff below) only gets us part of the way. Say you create a doc with two attachments, each with a length just under `max_http_request_size`: the doc is accepted, but the multipart request the replicator sends for it exceeds the limit and is rejected. I haven't checked this, but a conflicting doc with one attachment just under `max_http_request_size` on each conflict branch might hit the same problem. This leads us to having to decide:
References:
--- a/src/chttpd/src/chttpd_db.erl
+++ b/src/chttpd/src/chttpd_db.erl
@@ -1218,6 +1218,7 @@ db_attachment_req(#httpd{method=Method, user_ctx=Ctx}=Req, Db, DocId, FileNamePa
undefined -> <<"application/octet-stream">>;
CType -> list_to_binary(CType)
end,
+ couch_httpd:check_max_request_length(Req),
Data = fabric:att_receiver(Req, chttpd:body_length(Req)),
ContentLen = case couch_httpd:header_value(Req,"Content-Length") of
undefined -> undefined; |
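With a check like that in place, an oversized standalone attachment PUT should come back as a 413 up front instead of being accepted and then failing to replicate. A hypothetical spot-check, using the dev-cluster ports from the repro below:
# set the limit low, then try a standalone attachment PUT that exceeds it;
# with the patch applied this should return a 413 rather than succeeding
curl -X PUT http://127.0.0.1:15986/_config/httpd/max_http_request_size -d '"1500"'
dd if=/dev/zero bs=3000 count=1 2>/dev/null | \
  curl -i -X PUT http://127.0.0.1:15984/db/bigdoc/att \
       -H content-type:application/octet-stream --data-binary @-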
Shorter repro that runs quickly; it tests both the single attachment > `max_http_request_size` case and the two attachments each < `max_http_request_size` case. Look for the two instances of `doc_write_failure` in the replication responses. #!/bin/sh
COUCH=http://127.0.0.1:15984
INT=http://127.0.0.1:15986
DBA=$COUCH/db
DBB=$COUCH/dbb
# cleanup
curl -X DELETE $DBA
curl -X DELETE $DBB
# setup
curl -X PUT $DBA
curl -X PUT $DBB
# config
curl -X PUT $INT/_config/httpd/max_http_request_size -d '"1500"'
curl -X PUT $INT/_config/replicator/retries_per_request -d '"1"'
# create an att > max_http_request_size, should succeed
# 3000 here as not to run into _local checkpoint size limits
BODY3000=11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
curl -X PUT http://127.0.0.1:15984/db/doc/att --data-binary "$BODY3000" -Hcontent-type:application/octet-stream
# replicate, should succeed, but with one doc_write_failure
curl -X POST $COUCH/_replicate -d "{\"source\": \"$DBA\", \"target\": \"$DBB\"}" -H content-type:application/json
# create two atts, each < max_http_request_size, but att1+att2 > max_http_request_size
# cleanup
curl -X DELETE $DBA
curl -X DELETE $DBB
# setup
curl -X PUT $DBA
curl -X PUT $DBB
BODY1500=11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
REV=`curl -sX PUT http://127.0.0.1:15984/db/doc1/att --data-binary "$BODY1500" -Hcontent-type:application/octet-stream | cut -b 31-64`
curl -X PUT http://127.0.0.1:15984/db/doc1/att2?rev=$REV --data-binary "$BODY1500" -Hcontent-type:application/octet-stream
# replicate, should succeed, but with one doc_write_failure
curl -X POST $COUCH/_replicate -d "{\"source\": \"$DBA\", \"target\": \"$DBB\"}" -H content-type:application/json |
This reverts commit 4a73d03. Latest Mochiweb 2.17 might have helped a bit, but running `soak-eunit suites=couch_replicator_small_max_request_size_target` still makes it fail after 10-15 runs locally for me.
This reverts commit ba624ea.
Expected Behavior
Replication of a DB with attachments into 2.1.0 should be successful.
Current Behavior
Replication crashes after a while with the following stack trace:
Replication restarts, the error repeats and replication never finishes.
Feels like an instance of #574, which we thought had been resolved.
Your Environment