Error: 'utf-8' codec can't decode byte 0x9d in position 681: invalid start byte #164

Closed
InLineR495 opened this issue Mar 18, 2021 · 9 comments · Fixed by #173
Labels
bug Something isn't working

Comments

@InLineR495

Starting restler with cmd:

restler test --grammar_file Compile/grammar.py --dictionary_file Compile/dict.json --token_refresh_interval 3500 --token_refresh_command "python3.8 /opt/api/some-service/token-some-service.py" --no_ssl

Some minutes later:

ERROR: Restler engine failed. See logs in /opt/api/admin-service/Test directory for more information.

But EngineStdErr is empty. The information in EngineStdOut:

Initializing: Garbage collection every 30 seconds.
Terminating garbage collection. Waiting for max 300 seconds.
'utf-8' codec can't decode byte 0x9d in position 681: invalid start byte

The information from main.txt:

Rendering request 14 from scratch

2021-03-18 11:09:59.824: Final Swagger spec coverage: 4 / 25
2021-03-18 11:09:59.824: Rendered requests: 0 / 25
2021-03-18 11:09:59.824: Rendered requests with "valid" status codes: 4 / 0
2021-03-18 11:09:59.824: Num fully valid requests (no resource creation failures): 4
2021-03-18 11:09:59.824: Num requests not rendered due to invalid sequence re-renders: 0
2021-03-18 11:09:59.824: Num invalid requests caused by failed resource creations: 0
2021-03-18 11:09:59.824: Total Creations of Dyn Objects: 0
2021-03-18 11:09:59.824: Total Requests Sent: {'gc': 0, 'main_driver': 71}
2021-03-18 11:09:59.824: Bug Buckets: {'main_driver_504': 1}

@marina-p
Contributor

Hello @InLineR495,

Thanks for reporting this. Would you be able to post a sample response from your logs? It is probably the last "Received" in the network.testing file before RESTler exited.

Thanks,

Marina

@marina-p marina-p self-assigned this Mar 18, 2021
@marina-p marina-p added bug Something isn't working help wanted Pick this issue - it is ready to be worked on. labels Mar 18, 2021
@InLineR495
Author

InLineR495 commented Mar 18, 2021

The last entry in the logs was a "Sending":

Generation-1: Rendering Sequence-1

Request: 1 (Remaining candidate combinations: 1)
Request hash: 9b9dc8102bb0870e8fbc21d55535317e80801868

	- restler_static_string: 'GET '
	- restler_static_string: '/'
	- restler_static_string: 'votings'
	- restler_static_string: '/'
	- restler_static_string: 'result'
	- restler_static_string: ' HTTP/1.1\r\n'
	- restler_static_string: 'Accept: application/json\r\n'
	- restler_static_string: 'Host: example.com\r\n'
	+ restler_refreshable_authentication_token: ['token_refresh_cmd', 'token_refresh_interval']
	- restler_static_string: '\r\n'

2021-03-18 11:09:59.309: Sending: 'GET /votings/result HTTP/1.1\r\nAccept: application/json\r\nHost: example.com\r\n_OMITTED_AUTH_TOKEN_\r\nContent-Length: 0\r\nUser-Agent: restler/7.2.0\r\n\r\n'

Previous response:

2021-03-18 11:09:59.200: Received: 'HTTP/1.1 200 \r\nServer: nginx/1.16.1\r\nDate: Thu, 18 Mar 2021 11:08:28 GMT\r\nContent-Type: application/json;charset=UTF-8\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nvary: Origin\r\nvary: Access-Control-Request-Method\r\nvary: Access-Control-Request-Headers\r\nx-content-type-options: nosniff\r\nx-xss-protection: 1; mode=block\r\ncache-control: no-cache, no-store, max-age=0, must-revalidate\r\npragma: no-cache\r\nexpires: 0\r\nx-frame-options: DENY\r\nset-cookie: 09d214e18cbdacf6d05920025f665caf=d64266dd1b5f958a3edfaaa2128996e1; path=/; HttpOnly\r\n\r\nc2\r\n{"error":{"code":129,"description":"description here"}}\r\n0\r\n\r\n'

@stishkin stishkin self-assigned this Mar 31, 2021
@stishkin
Contributor

@InLineR495
I added some logging to better understand where the issue is happening:

#173

The code is located in this branch:
https://github.com/microsoft/restler-fuzzer/tree/stas/164

Could you run your test with my changes and share the logs?

@stishkin stishkin removed the help wanted Pick this issue - it is ready to be worked on. label Mar 31, 2021
@InLineR495
Author

InLineR495 commented Apr 2, 2021

Here is the output from network.testing (omitting sensitive data):

2021-04-02 09:35:08.961: Failed to decode header data due to 'utf-8' codec can't decode byte 0xd0 in position 1493: invalid continuation byte when decoding b'HTTP/1.1 200 \r\nServer: nginx/X.XX\r\nDate: Fri, 02 Apr 2021 09:35:08 GMT\r\nContent-Type: application/json;charset=UTF-8\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nvary: Origin\r\nvary: Access-Control-Request-Method\r\nvary: Access-Control-Request-Headers\r\nset-cookie: abf3ed9a58c43deadc2a1b4aca570fda=c4685aad5e98f9969fd39702c736043d; path=/; HttpOnly\r\ncache-control: private\r\n\r\n451\r\n{"data":[{"id":"9c0cf2c3-d53d-4dc9-93b8-1a84807af922","vId":"26d50624-ce16-47d3-8ac1-5736dy895246","eId":"21231100000000","vName":"\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92","name":"\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92 111","lang":0,"mMar":1,"mType":"ON","q":[{"id":"4207a31a-0a61-43c1-9594-2cfb5ee4a39c","num":1,"eId":"21231100000000","sText":"\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98r\n'

I think it is somehow connected with Cyrillic characters.

@stishkin
Contributor

stishkin commented Apr 2, 2021

@InLineR495

I can't reproduce the issue with the data you shared:

s = b'HTTP/1.1 200 \r\nServer: nginx/X.XX\r\nDate: Fri, 02 Apr 2021 09:35:08 GMT\r\nContent-Type: application/json;charset=UTF-8\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nvary: Origin\r\nvary: Access-Control-Request-Method\r\nvary: Access-Control-Request-Headers\r\nset-cookie: abf3ed9a58c43deadc2a1b4aca570fda=c4685aad5e98f9969fd39702c736043d; path=/; HttpOnly\r\ncache-control: private\r\n\r\n451\r\n{"data":[{"id":"9c0cf2c3-d53d-4dc9-93b8-1a84807af922","vId":"26d50624-ce16-47d3-8ac1-5736dy895246","eId":"21231100000000","vName":"\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92","name":"\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92\xd0\x92 111","lang":0,"mMar":1,"mType":"ON","q":[{"id":"4207a31a-0a61-43c1-9594-2cfb5ee4a39c","num":1,"eId":"21231100000000","sText":"\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98\xd0\x98r\n'

s.decode('utf-8')
'HTTP/1.1 200 \r\nServer: nginx/X.XX\r\nDate: Fri, 02 Apr 2021 09:35:08 GMT\r\nContent-Type: application/json;charset=UTF-8\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nvary: Origin\r\nvary: Access-Control-Request-Method\r\nvary: Access-Control-Request-Headers\r\nset-cookie: abf3ed9a58c43deadc2a1b4aca570fda=c4685aad5e98f9969fd39702c736043d; path=/; HttpOnly\r\ncache-control: private\r\n\r\n451\r\n{"data":[{"id":"9c0cf2c3-d53d-4dc9-93b8-1a84807af922","vId":"26d50624-ce16-47d3-8ac1-5736dy895246","eId":"21231100000000","vName":"ВВВВВВ","name":"ВВВВВВ 111","lang":0,"mMar":1,"mType":"ON","q":[{"id":"4207a31a-0a61-43c1-9594-2cfb5ee4a39c","num":1,"eId":"21231100000000","sText":"ИИИИИИИИИИИИИИr\n'

According to the Python documentation (https://docs.python.org/3/howto/unicode.html#the-unicode-type), we can either ignore the bad character and continue receiving data, or replace the character that causes the failure.

Also see:
https://wiki.python.org/moin/UnicodeDecodeError
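
To make the two options concrete, here is a minimal sketch of how Python's built-in "strict", "ignore" and "replace" error handlers treat a malformed byte sequence. The sample bytes and the decode_with helper are made up for illustration; they are not taken from RESTler or from the logs above.

```python
# Hypothetical sample: 0xd0 starts a two-byte UTF-8 sequence, but the
# byte after the second 0xd0 is "r", which is not a continuation byte.
raw = b'"sText":"\xd0\x98\xd0r\n'


def decode_with(errors):
    """Decode the sample bytes as UTF-8 with the given error handler."""
    return raw.decode("utf-8", errors=errors)


# "strict" (the default) raises UnicodeDecodeError.
try:
    decode_with("strict")
    strict_reason = None
except UnicodeDecodeError as exc:
    strict_reason = exc.reason

# "ignore" silently drops the offending byte.
ignored = decode_with("ignore")

# "replace" substitutes U+FFFD (the replacement character) for it.
replaced = decode_with("replace")

print(strict_reason)
print(repr(ignored))
print(repr(replaced))
```

Note that "ignore" loses data without leaving a trace, while "replace" leaves a visible marker in the decoded text.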

@InLineR495
Author

It would be great to handle these types of exceptions and continue running.

Also, I got this today:

Initializing: Garbage collection every 30 seconds.
No checkpoints used at this phase
2021-04-02 19:01:36.390: Generation: 1
Terminating garbage collection. Waiting for max 300 seconds.
Traceback (most recent call last):
  File "/opt/restler_bin/engine/engine/core/fuzzer.py", line 42, in run
    self._num_total_sequences = driver.generate_sequences(
  File "/opt/restler_bin/engine/engine/core/driver.py", line 652, in generate_sequences
    seq_collection = render(seq_collection, fuzzing_pool, checkers, generation, global_lock)
  File "/opt/restler_bin/engine/engine/core/driver.py", line 266, in render_sequential
    valid_renderings = render_one(seq_collection, ith, checkers, generation, global_lock)
  File "/opt/restler_bin/engine/engine/core/driver.py", line 198, in render_one
    renderings = current_seq.render(candidate_values_pool, global_lock)
  File "/opt/restler_bin/engine/engine/core/sequences.py", line 402, in render
    response = request_utilities.send_request_data(rendered_data)
  File "/opt/restler_bin/engine/engine/core/request_utilities.py", line 218, in send_request_data
    success, response = sock.sendRecv(rendered_data,
  File "/opt/restler_bin/engine/engine/transport_layer/messaging.py", line 86, in sendRecv
    response = HttpResponse(self._recvResponse(req_timeout_sec))
  File "/opt/restler_bin/engine/engine/transport_layer/messaging.py", line 229, in _recvResponse
    data += buf.decode(UTF8)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 315: invalid continuation byte
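
The traceback shows each received buffer being decoded on its own (data += buf.decode(UTF8)). One situation in which that pattern can fail on otherwise valid UTF-8 is a multi-byte character split across two reads. Whether that is what happened here is not established in this thread, so treat the following only as a sketch of that scenario, using the standard library's incremental decoder. The decode_chunks helper and the sample chunks are hypothetical and are not RESTler code.

```python
import codecs


def decode_chunks(chunks, errors="strict"):
    """Decode a sequence of byte chunks as one UTF-8 stream.

    The incremental decoder buffers an incomplete multi-byte sequence
    at the end of a chunk and finishes it with the next chunk.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors=errors)
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True makes a truly truncated stream raise (or be handled
    # by the chosen error handler) instead of being silently dropped.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# Hypothetical chunks: the Cyrillic letter encoded as 0xd0 0x9d is
# split between the end of the first read and the start of the second.
chunks = [b'{"name":"\xd0', b'\x9d"}']

# Decoding the second chunk by itself fails on its first byte.
try:
    chunks[1].decode("utf-8")
    per_chunk_reason = None
except UnicodeDecodeError as exc:
    per_chunk_reason = exc.reason

print(per_chunk_reason)
print(decode_chunks(chunks))
```

An alternative with the same effect is to accumulate raw bytes and decode once after the whole response has been received.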

@stishkin
Contributor

stishkin commented Apr 2, 2021

@InLineR495 - I added the "ignore" parameter for the case where decoding fails. Can you give that change a try?

https://github.com/microsoft/restler-fuzzer/pull/173/files

@InLineR495
Author

Now, it's working!
Can you merge it into main?

@marina-p
Contributor

@InLineR495 We will get this merged next week; Stas is currently on vacation.

We're going to put it under an option for now, since in some cases this parsing issue could indicate a bug.
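
As a rough illustration of what putting the behavior under an option could look like, here is a minimal sketch. The decode_response function and the ignore_decoding_failures flag are hypothetical names chosen for this example; they are not claimed to match the code in the pull request.

```python
def decode_response(buf, ignore_decoding_failures=False):
    """Decode a response buffer as UTF-8.

    By default a malformed buffer still raises, so a decoding failure
    that points at a real bug stays visible. With the (hypothetical)
    flag enabled, undecodable bytes are dropped and the run continues.
    """
    try:
        return buf.decode("utf-8")
    except UnicodeDecodeError:
        if not ignore_decoding_failures:
            raise
        return buf.decode("utf-8", errors="ignore")


# Opt-in behavior: the stray 0xd0 byte is dropped.
print(decode_response(b"ok \xd0r", ignore_decoding_failures=True))
```

Keeping the strict behavior as the default means nothing is discarded silently unless the user asks for it.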
