examples/ota: add documentation on how to deliver update via ethos #15

Open
wants to merge 1 commit into
base: wip/rebase/ota_work_branch

Conversation

fedepell

Contribution description

Add documentation on how the OTA procedure can be tested by delivering the update over the serial interface using the ethos driver.

Issues/PRs references

See comments in RIOT-OS#9969

Replaces #12 as that became a bit messy because of rebasing :)

@kYc0o
Owner

kYc0o commented Oct 31, 2018

Answering here to your questions:

vagrant@vagrant:~$ ifconfig tap0
tap0      Link encap:Ethernet  HWaddr d2:86:c8:f2:26:b3  
          inet6 addr: fe80::d086:c8ff:fef2:26b3/64 Scope:Link
          inet6 addr: fe80::1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:186 (186.0 B)  TX bytes:1642 (1.6 KB)
vagrant@vagrant:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.2.2        0.0.0.0         UG    0      0        0 enp0s3
10.0.2.0        0.0.0.0         255.255.255.0   U     0      0        0 enp0s3
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0

Apparently I have no routes for ipv6 addresses. Did you run some of the tuntap scripts provided in RIOT?

@kYc0o
Owner

kYc0o commented Oct 31, 2018

For completeness:

vagrant@vagrant:~/RIOT$ sudo sh dist/tools/ethos/start_network.sh /dev/ttyACM0 tap0 2001:db8::/64
net.ipv6.conf.tap0.forwarding = 1
net.ipv6.conf.tap0.accept_ra = 0
----> ethos: sending hello.
----> ethos: activating serial pass through.
----> ethos: hello reply received
----> ethos: hello received
main(): This is RIOT! (Version: 2018.07-devel-2297-g359ee-HEAD)
RIOT OTA update over CoAP example application
firmware: running from slot 1
Firmware magic_number: 0x544f4952
Firmware Version: 0x40920186
Firmware start address: 0x00001100
Firmware APPID: 0
Firmware Size: 66940
Firmware HASH: 52 43 83 1a d8 d4 d5 b0 29 db 39 c1 14 9d f0 d5 2f da 19 bf 77 f1 3d a5 14 f3 5d a3 cd 38 ed b4 
Firmware chksum: 0x4749f0b9
Firmware signature: b9 1c 6b 32 84 fe e5 ce fb fd 86 a1 90 5e 31 c0 cf a1 9e 81 07 a5 c7 da d7 ea bb 12 b3 09 ce 48 fd 29 2d 60 d6 67 b4 fc ff 7f e5 9a 81 d8 be a9 96 05 33 87 50 3b 72 50 32 52 f5 3e e6 bf aa 06 
Waiting for address autoconfiguration...
Configured network interfaces:
Iface  6  HWaddr: 25:A2  Channel: 26  Page: 0  NID: 0x23
          Long HWaddr: 79:67:37:4E:0E:B8:A5:A2 
           TX-Power: 0dBm  State: IDLE  max. Retrans.: 3  CSMA Retries: 4 
          AUTOACK  ACK_REQ  CSMA  MTU:1280  HL:64  IPHC  
          Source address length: 8
          Link type: wireless
          inet6 addr: fe80::7b67:374e:eb8:a5a2  scope: local  VAL
          inet6 group: ff02::1
          
Iface  7  HWaddr: 00:2E:CE:65:09:42 
          MTU:1500  HL:64  Source address length: 6
          Link type: wired
          inet6 addr: fe80::22e:ceff:fe65:942  scope: local
          inet6 group: ff02::1

@kYc0o
Owner

kYc0o commented Oct 31, 2018

Sorry, for IPv6 the command is different:

vagrant@vagrant:~$ route -n6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2001:db8::/64                  fe80::2                    UG   1024 0     0 tap0
fd00:dead:beef::1/128          ::                         !n   256 0     0 lo
fe80::/64                      ::                         U    256 0     0 enp0s3
fe80::/64                      ::                         U    256 1     5 tap0
::/0                           ::                         !n   -1  1   110 lo
::1/128                        ::                         Un   0   3    23 lo
fd00:dead:beef::1/128          ::                         Un   0   1     0 lo
fe80::/128                     ::                         Un   0   1     0 lo
fe80::1/128                    ::                         Un   0   1     0 lo
fe80::a00:27ff:fed1:2804/128   ::                         Un   0   1     0 lo
fe80::d086:c8ff:fef2:26b3/128  ::                         Un   0   2    12 lo
ff00::/8                       ::                         U    256 0     0 enp0s3
ff00::/8                       ::                         U    256 1     3 tap0
::/0                           ::                         !n   -1  1   110 lo

@fedepell
Author

Hmm, that looks correct and similar to mine. The only important difference is that your device has 2 network interfaces (Iface 6 is the radio, Iface 7 is ethos). Could you try temporarily disabling Iface 6?

@fedepell
Author

My setup:

Waiting for address autoconfiguration...
Configured network interfaces:
Iface  5  HWaddr: 02:05:d6:16:21:31 
          MTU:1500  HL:64  Source address length: 6
          Link type: wired
          inet6 addr: fe80::5:d6ff:fe16:2131  scope: local  VAL
          inet6 group: ff02::1
          inet6 group: ff02::1:ff16:2131

Pc side:

$ ifconfig tap0
 tap0      Link encap:Ethernet  HWaddr 5a:43:14:70:2a:d2  
           inet6 addr: fe80::5843:14ff:fe70:2ad2/64 Scope:Link
           inet6 addr: fe80::1/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:15 errors:0 dropped:1 overruns:0 frame:0
           TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000 
           RX bytes:1401 (1.4 KB)  TX bytes:7417 (7.4 KB)
 
$ route -n6
 Kernel IPv6 routing table
 Destination                    Next Hop                   Flag Met Ref Use If
 2001:db8::/64                  fe80::2                    UG   1024 0     0 tap0
 2001:4c50:16c:7f00::/64        ::                         UAe  256 0     0 eth0
 fd00:dead:beef::1/128          ::                         !n   256 0     0 lo
 fe80::/64                      ::                         U    256 0     0 eth0
 fe80::/64                      ::                         U    256 3    10 tap0
 ::/0                           fe80::5a23:8cff:fe19:7124  UGDAe 1024 2    17 eth0
 ::/0                           ::                         !n   -1  1    85 lo
 ::1/128                        ::                         Un   0   5    76 lo
 2001:4c50:16c:7f00:21f:16ff:fefa:15d3/128 ::                         Un   0   3    11 lo
 fd00:dead:beef::1/128          ::                         Un   0   1     0 lo
 fe80::/128                     ::                         Un   0   1     0 lo
 fe80::1/128                    ::                         Un   0   1     0 lo
 fe80::21f:16ff:fefa:15d3/128   ::                         Un   0   2     2 lo
 fe80::5843:14ff:fe70:2ad2/128  ::                         Un   0   5     8 lo
 ff00::/8                       ::                         U    256 4   169 eth0
 ff00::/8                       ::                         U    256 3    30 tap0
 ::/0                           ::                         !n   -1  1    85 lo

$ ping6 fe80::5:d6ff:fe16:2131%tap0
PING fe80::5:d6ff:fe16:2131%tap0(fe80::5:d6ff:fe16:2131) 56 data bytes
64 bytes from fe80::5:d6ff:fe16:2131: icmp_seq=1 ttl=64 time=23.1 ms
64 bytes from fe80::5:d6ff:fe16:2131: icmp_seq=2 ttl=64 time=23.0 ms
^C
--- fe80::5:d6ff:fe16:2131%tap0 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 23.080/23.110/23.141/0.155 ms

@kYc0o
Owner

kYc0o commented Oct 31, 2018

Yep, that was the trick... However, we then need to be able to specify which interface is configured to do the tunneling. Usually ethos is used to set up a border router, so the default routing goes through the wireless interface.

@fedepell
Author

Sorry, my board (the saml21-xpro) does not have a radio, which is why I did not notice that. I will check whether the interface can be forced somehow.
As a side comment, I think ethos is interesting for cases like mine, where you do not have another network interface, so it could still fit the need. It would of course be better if we found a way to force the interface (that part needs investigating on the RIOT side).

@kYc0o
Owner

kYc0o commented Oct 31, 2018

Yes, definitely.

Back to the main topic, it seems something is not working...

vagrant@vagrant:~/RIOT$ sudo sh dist/tools/ethos/start_network.sh /dev/ttyACM0 tap0 2001:db8::/64
net.ipv6.conf.tap0.forwarding = 1
net.ipv6.conf.tap0.accept_ra = 0
----> ethos: sending hello.
----> ethos: activating serial pass through.
----> ethos: hello reply received
----> ethos: hello received
main(): This is RIOT! (Version: 2018.10-devel-849-g2e0d9-vagrant-HEAD)
RIOT OTA update over CoAP example application
firmware: running from slot 1
Firmware magic_number: 0x544f4952
Firmware Version: 0x41005359
Firmware start address: 0x00001100
Firmware APPID: 0
Firmware Size: 54752
Firmware HASH: 66 80 ca 04 67 5e 1c 2d 4a f1 e3 61 42 10 77 c5 7e 67 c2 c9 f6 5b 1c 09 ad 01 c6 b7 60 03 eb eb 
Firmware chksum: 0x8fe042fb
Firmware signature: 58 b5 34 5c d9 72 c0 45 5f 30 95 42 22 af 38 63 33 4e 58 76 77 a8 b7 7d a8 86 1a 9b 03 d9 aa 3f c3 3b 66 b3 86 6a 4c 78 eb 8e d4 da 79 3f 6e d6 f1 2d 6a b3 46 f8 18 a2 15 8c 06 9e 42 12 d9 0c 
Waiting for address autoconfiguration...
Configured network interfaces:
Iface  5  HWaddr: 00:2e:ce:65:09:42 
          MTU:1500  HL:64  Source address length: 6
          Link type: wired
          inet6 addr: fe80::22e:ceff:fe65:942  scope: local  VAL
          inet6 group: ff02::1
          inet6 group: ff02::1:ff65:942
          
ota: received bytes 0-64
ota: initializing update to target slot 2
ota: received bytes 0-64
ota: initializing update to target slot 2
ota: received bytes 64-128
ota: received bytes 128-192
ota: received bytes 192-256
ota: verifying metadata ...
ota: verification successful
ota: received bytes 192-256 of 55008 (left=54752)
coap_ota_handler(): ignoring already received block
ota: received bytes 256-320 of 55008 (left=54688)
ota: processing bytes 256-319 page 521-0x20900
ota: received bytes 320-384 of 55008 (left=54624)
.
.
.
.
coap_ota_handler(): ignoring already received block
ota: received bytes 54912-54976 of 55008 (left=32)
coap_ota_handler(): ignoring already received block
ota: received bytes 54976-55008 of 55008 (left=0)
coap_ota_handler(): ignoring already received block
ota: image hash incorrect!

Do you also have lots of coap_ota_handler(): ignoring already received block?

@fedepell
Author

fedepell commented Nov 1, 2018

Yes, I had many of those "ignoring" prints, but it always succeeded in the end. I assumed it was due to the slow link and UDP timeouts.
I will check now with the latest git version to see how it behaves on my board (my last tests with this were on 5th October, when I opened #12) and report!

@fedepell
Author

fedepell commented Nov 1, 2018

I now have the same problem too, and after some research I noticed that only the first part of the flash is written. It was working for me because, by (bad) luck, the flash already contained the image from an earlier manual run, and if you rebuild with only APP_VER modified the image will by chance be the same. If I always erase the flash completely beforehand, I get the same problem you describe.
Sorry for not noticing before :( I will now try to figure out what is going on.

@fedepell
Author

fedepell commented Nov 1, 2018

The "ignoring already received block" message is misleading: the block is not "already received" but "out of sequence". Indeed, if you look:

ota: processing bytes 384-447 page 17-0x1100
ota: received bytes 448-512 of 54464 (left=53952)
ota: processing bytes 448-511 page 17-0x1100
ota: received bytes 512-576 of 54464 (left=53888)
ota: processing bytes 512-575 page 18-0x1200
**ota: received bytes 640-704 of 54464 (left=53760)**
coap_ota_handler(): ignoring already received block
ota: received bytes 704-768 of 54464 (left=53696)
coap_ota_handler(): ignoring already received block

You can see that a packet (576 to 640) gets lost! After that nothing is written to flash, because every following block is out of sequence, not "already received". Indeed, the check in the code is:

        if (block1.offset == _state.writer.offset) {
            res = firmware_simple_putbytes(&_state,
                    pkt->payload, pkt->payload_len);
        }
        else {
            LOG_INFO("coap_ota_handler(): ignoring already received block\n");
            res = 0;
        }

We end up in the "else" branch not because the block was already received, but simply because the offsets differ: in this case block1.offset == 640 while _state.writer.offset == 576.

So now the main problem is that a packet is sometimes lost.

We could also of course do a better logging there.

@fedepell
Author

fedepell commented Nov 1, 2018

Just an update: if you use a COAP packet size (-b option) of 16 or 32 bytes, which slows down the transfer, then it works correctly! (Can you confirm, please?)

I will now check the Ethos/COAP buffer sizes; the packet must be getting lost there.

@fedepell fedepell force-pushed the ota_contrib_ethosdoc_2 branch from e25454a to 6dea07b Compare November 1, 2018 07:49
@fedepell
Author

fedepell commented Nov 1, 2018

I updated the doc with the smaller size on the command line.

I'm still unsure where the packet gets lost: ETHOS has a 2k buffer, STDIO UART 64 (I tried making it bigger, it didn't change anything). Maybe it happens at a lower level. I'll try to investigate further in case it is a generic problem, even though using a smaller COAP packet solves the issue here.

I would still propose changing how coap_ota_handler handles erroneous packets, as per my previous comment.

@kYc0o
Owner

kYc0o commented Nov 1, 2018

@bergzand can you take a look at the coap_ota_handler? Are you using the same for block2 transfers?

@kYc0o
Owner

kYc0o commented Nov 1, 2018

@fedepell unfortunately it worked with neither 32- nor 16-byte blocks :-(

I'll try to completely erase my device and test again. You're testing this branch, aren't you?

@fedepell
Author

fedepell commented Nov 2, 2018

@kYc0o: very sorry about that :( I have no idea what's wrong then. We do have different boards, but I would still expect it to work.
Yes, I'm on the correct branch:

$ git status
HEAD detached at kYc0o/wip/rebase/ota_work_branch

commit 359ee2e78be7a128dffc6d5e617dc6624f214426

@fedepell
Author

fedepell commented Nov 5, 2018

I'm trying to debug the issue this morning.

I created a fake firmware that is all zeros, except that the first byte of each 64-byte block holds an incrementing integer, so I can track the packets exactly. I also added some logging code to the COAP ota handler.
I can see via wireshark that all the packets are correctly sent by coap-client, but one is always lost on the RIOT side (the lines enclosed in [...] are my comments).

ota: writer offset 576 (first byte 82)ota: received bytes 0-64
ota: initializing update to target slot 1
ota: writer offset 64 (first byte 53)ota: received bytes 64-128
ota: writer offset 128 (first byte 9)ota: received bytes 128-192
ota: writer offset 192 (first byte 0)ota: received bytes 192-256
ota: verifying metadata ...

[here there is a quite long pause since verification is done. During this pause 
coap-client will resend 3 times the next packet as it gets no immediate 
response from RIOT side]

ota: verification successful
ota: writer offset 256 (first byte 0)ota: received bytes 192-256 of 54512 (left=54256)
coap_ota_handler(): ignoring already received block
ota: writer offset 256 (first byte 0)ota: received bytes 192-256 of 54512 (left=54256)
coap_ota_handler(): ignoring already received block

[as you see two of the packets are ignored since indeed the same packet is 
received 3 times as confirmed by wireshark. Now follows the other packets, 
you see that I print "first byte" is the integer I said I added as a counter. 
You can see that the first packets are correctly received and written to flash]

ota: writer offset 256 (first byte 0)ota: received bytes 256-320 of 54512 (left=54192)
ota: processing bytes 256-319 page 17-0x1100
ota: writer offset 320 (first byte 1)ota: received bytes 320-384 of 54512 (left=54128)
ota: processing bytes 320-383 page 17-0x1100
ota: writer offset 384 (first byte 2)ota: received bytes 384-448 of 54512 (left=54064)
ota: processing bytes 384-447 page 17-0x1100
ota: writer offset 448 (first byte 3)ota: received bytes 448-512 of 54512 (left=54000)
ota: processing bytes 448-511 page 17-0x1100
ota: writer offset 512 (first byte 4)ota: received bytes 512-576 of 54512 (left=53936)
ota: processing bytes 512-575 page 18-0x1200

[and here you see we go from packet 4 to packet 6, one packet is lost, but 
I can see it clearly on wireshark, and also on coap-client verbose output, 
that it was indeed sent. I also see on wireshark that the packet is acknowledged
by the COAP layer so coap-client will continue with the next one.
So a packet is lost somewhere between the coap layer and the ota coap handler, and
then the whole process is ruined]

ota: writer offset 576 (first byte 6)ota: received bytes 640-704 of 54512 (left=53808)
coap_ota_handler(): ignoring already received block
ota: writer offset 576 (first byte 7)ota: received bytes 704-768 of 54512 (left=53744)
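
For anyone wanting to reproduce this trace, the test image described at the top of this comment could be generated roughly as follows. This is only a sketch: `fill_test_image` and `BLOCK_SIZE` are names made up here, and starting the counter at block 4 (offset 256, right after the metadata) is an assumption read off the log above, where offset 256 carries counter 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64u /* CoAP block size used in the trace above */

/* Zero the whole buffer, then stamp the first byte of every 64-byte block,
 * starting at block `first_block`, with an incrementing counter.
 * The counter is a single byte, so it wraps around after 256 blocks. */
static void fill_test_image(uint8_t *buf, size_t len, size_t first_block)
{
    assert(buf != NULL);

    memset(buf, 0, len);

    uint8_t counter = 0;
    for (size_t off = first_block * BLOCK_SIZE; off < len; off += BLOCK_SIZE) {
        buf[off] = counter++;
    }
}
```

With `first_block` set to 4, the byte at offset 320 is 1 and the byte at offset 512 is 4, which matches the "first byte" values printed in the log.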

Now one very interesting thing: I tried to enable the debug in nanocoap.c to trace the lost packet with:

#define ENABLE_DEBUG (1)

This gives me much more information. But what is very interesting is that with debug enabled no packets are lost any more! (@kYc0o maybe you can confirm, out of curiosity, whether the same happens on your side?)

Actually just putting a single line to trace:

printf("Calling ota_resource_handler (%d)\n",pkt->payload[0]);

in nanocoap.c just before:

return resource->handler(pkt, resp_buf, resp_buf_len, resource->context);

Will mask the problem

Now I will try to continue to find what changes without debugging...

@fedepell
Author

fedepell commented Nov 5, 2018

Another possibly interesting note: setting the serial speed to 9600 (you have to do it both in the Makefile of the ota application and in the start_network.sh script) also masks the problem.
It's definitely a timing issue, searching more...
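
To put a number on the timing difference, here is a back-of-the-envelope sketch of how long one frame occupies the serial line. It assumes 8N1 framing (10 bits on the wire per byte) and ignores any ethos escaping, so it is a lower bound; the 115200 figure is an assumed default, not something stated in this thread, and `uart_frame_time_us` is a made-up helper name.

```c
#include <assert.h>
#include <stdint.h>

/* 8N1 framing: 1 start bit + 8 data bits + 1 stop bit per byte (assumption) */
#define BITS_PER_BYTE_8N1 10u

/* Time in microseconds needed to clock `frame_len` bytes over a UART
 * running at `baud` bits per second, rounded down. */
static uint32_t uart_frame_time_us(uint32_t frame_len, uint32_t baud)
{
    assert(baud > 0);

    uint64_t bits = (uint64_t)frame_len * BITS_PER_BYTE_8N1;
    return (uint32_t)((bits * 1000000u) / baud);
}
```

For the 172-byte frames reported later in this thread, this gives about 14.9 ms at 115200 baud and about 179 ms at 9600 baud, so at the slower speed the receiver has roughly twelve times longer to drain each byte.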

@fedepell
Author

fedepell commented Nov 5, 2018

It seems the packet is corrupted on the serial/ethos side.

A sane packet of the sequence (with a 64-byte data payload) is 172 bytes long, for example (the 04 just before the long run of zeros is my incrementing counter):

ethos _end_of_frame: (len=172)
Packet recv len=172
02 05 d6 16 21 31 8e 9b 60 15 f3 5f 86 dd 60 09 c6 a1 00 76
11 40 fe 80 00 00 00 00 00 00 8c 9b 60 ff fe 15 f3 5f fe 80
00 00 00 00 00 00 00 05 d6 ff fe 16 21 31 8d 72 16 33 00 76
aa 90 40 03 73 72 3d 0e 66 65 38 30 3a 3a 35 3a 64 36 66 66
3a 66 65 31 36 3a 32 31 33 31 25 74 61 70 30 88 66 69 72 6d
77 61 72 65 d1 03 8a ff 04 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00

The packet that gets "lost", on the other hand, is shorter, sometimes by 1 byte and sometimes by more. In this run, for example, it was 170 bytes:

ethos _end_of_frame: (len=170)
Packet recv len=170
02 05 d6 16 21 31 8e 9b 60 15 f3 5f 86 dd 60 09 c6 a1 00 76
11 40 fe 80 00 00 00 00 00 00 8c 9b 60 ff fe 15 f3 5f fe 80
00 00 00 00 00 00 00 05 d6 ff fe 16 21 31 8d 72 16 33 00 76
99 8f 40 03 73 73 3d 0e 66 65 38 30 3a 3a 35 3a 64 36 66 66
3a 66 65 31 36 3a 32 31 33 31 25 74 61 70 30 88 66 69 72 6d
77 61 72 65 d1 03 9a ff 05 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00

This packet is then discarded and you get the out of sequence problem.
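
The truncation can also be read directly from the headers in the two dumps: both carry an IPv6 payload length of 0x0076 (118), so the full frame should be 14 (Ethernet) + 40 (IPv6) + 118 = 172 bytes. The following standalone sketch shows that check; the function names are hypothetical and this is not how ethos or the RIOT network stack implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14u     /* dst MAC + src MAC + EtherType */
#define IPV6_HDR_LEN   40u     /* fixed IPv6 header */
#define ETHERTYPE_IPV6 0x86DDu

/* Frame length implied by the IPv6 payload length field, or 0 if the
 * buffer is too short to hold both headers or is not an IPv6 frame. */
static size_t ipv6_frame_expected_len(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);

    if (len < ETH_HDR_LEN + IPV6_HDR_LEN) {
        return 0;
    }
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != ETHERTYPE_IPV6) {
        return 0;
    }
    /* payload length lives in bytes 4-5 of the IPv6 header, big endian */
    uint16_t payload_len = (uint16_t)((frame[ETH_HDR_LEN + 4] << 8)
                                      | frame[ETH_HDR_LEN + 5]);
    return ETH_HDR_LEN + IPV6_HDR_LEN + payload_len;
}

/* 1 if at least as many bytes arrived as the header announces, else 0 */
static int frame_is_complete(const uint8_t *frame, size_t len)
{
    size_t expected = ipv6_frame_expected_len(frame, len);
    return (expected != 0) && (len >= expected);
}
```

Fed the first 20 bytes of either dump, `ipv6_frame_expected_len` returns 172, so a frame that arrives with only 170 bytes is flagged as incomplete.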

Of course the main issue to address now is why these bytes are lost on the serial side, and I will look into it more (but not today anymore :( ). It also makes sense that a slower baud rate or a smaller packet size makes the problem go away or become less visible.

But a generic question: shouldn't we handle the possibility of a COAP UDP packet being lost (as in this case) and ask for it to be resent? I think this is important when developing an OTA update. As it stands, it seems to me that if a packet is lost the process collapses (or rather, it carries on while discarding packets).

@fedepell
Author

fedepell commented Nov 5, 2018

Just as additional info, here is a screenshot from the wireshark session. You can see that packet number 10 is the one that gets lost: there is no ACK/Continue for it (it goes from 9 to 11), and there is no attempt to recover or, failing that, to close the whole session, which is useless anyway at that point.

[screenshot: wireshark]

@kYc0o
Owner

kYc0o commented Nov 5, 2018

Wow! Thank you so much for the detailed information, it's very useful.

Now, that makes me think that there might be some bugs in the nanocoap block1 implementation, since it wasn't thoroughly tested. I'd like to have advice from @kaspar030 and @kb2ma who are more familiar with the code. Do you guys see something strange?

I'll make more tests on my side taking into account your suggestions, and also try to capture some traces from wireshark.

Thank you again for all the effort!

@kb2ma

kb2ma commented Nov 5, 2018

Let me make sure I understand the problem. The coap-client (coap-cli?) process is pushing firmware blockwise (block1) to a nanocoap server implemented in sys/net/application_layer/ota/coap.c, right? How is coap-client implemented?

One or more packets from coap-client are lost on the way to nanocoap. However, neither coap-client nor coap.c do anything about it. I see two problems.

  1. coap-client must resend a confirmable message for which it does not get a response.
  2. nanocoap should at least recognize when it receives blocks out of order. Beyond that it either should make sure it receives all blocks or else reject the update when it does not.

Let me know if I've missed something, and we can take it from there.

@fedepell
Author

fedepell commented Nov 5, 2018

@kb2ma: Thanks a lot for the feedback.

I'm no way an expert in this but I'll try to explain what I can see:

  1. Indeed, coap-client run as described in the README should, I suppose, send confirmable messages by default (there is a "-N Send NON-confirmable message" option, but it is not used). As far as I can see in the wireshark log, though, there is no sign of the lost message being resent.
    What is interesting is that an earlier message is resent when the device is not responding because it is busy with the verification, which takes a long time. From the wireshark screenshot I posted, I would not expect the communication to proceed to packet 11 if packet 10 never gets acknowledged, but it does continue.

  2. If the ota_coap_handler receives a packet out of order it will just ignore it but not return an error to coap:

        if (block1.offset == _state.writer.offset) {
            res = firmware_simple_putbytes(&_state,
                    pkt->payload, pkt->payload_len);
        }
        else {
            LOG_INFO("coap_ota_handler(): ignoring already received block\n");
            res = 0;
        }

Now, if the packet is indeed "already received" (so a duplicate) I agree it can simply be ignored, but as the code is written now, a future packet (meaning we are missing one, as in the case described) is ignored as well.
I guess this packet rejection logic has to be split into three cases: the right offset, an offset from an old packet (a duplicate, which can be ignored), and an offset too far in the future (we have a hole, so we have to do something to recover or, at worst, abort the whole operation, since everything that follows will certainly be out of order).
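
The three-way split proposed above could look roughly like this. It is a standalone sketch rather than a patch against ota/coap.c: `block_status_t`, `classify_block` and `block_status_msg` are hypothetical names, and what the handler does on a gap (for instance answering 4.08 Request Entity Incomplete as defined in RFC 7959, or aborting the update) is left open.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    BLOCK_IN_SEQUENCE, /* offset == writer offset: write it to flash */
    BLOCK_DUPLICATE,   /* offset <  writer offset: already written, ignore */
    BLOCK_GAP          /* offset >  writer offset: a block is missing */
} block_status_t;

/* Compare the offset of the received block with the next offset the
 * flash writer expects. */
static block_status_t classify_block(size_t block_offset, size_t writer_offset)
{
    if (block_offset == writer_offset) {
        return BLOCK_IN_SEQUENCE;
    }
    if (block_offset < writer_offset) {
        return BLOCK_DUPLICATE;
    }
    return BLOCK_GAP;
}

/* Log text per case, so a lost block is no longer reported as a duplicate */
static const char *block_status_msg(block_status_t status)
{
    switch (status) {
    case BLOCK_IN_SEQUENCE: return "writing block";
    case BLOCK_DUPLICATE:   return "ignoring already received block";
    case BLOCK_GAP:         return "block out of sequence, a block was lost";
    }
    assert(0 && "unknown block status");
    return "";
}
```

With the values from the earlier log, `classify_block(640, 576)` yields `BLOCK_GAP`, while the retransmitted block at offset 192 with the writer at 256 yields `BLOCK_DUPLICATE`.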

@kYc0o
Owner

kYc0o commented Nov 5, 2018

How is coap-client implemented?

It's libcoap. In my case in the develop branch with the latest commit.

@kb2ma I asked for your knowledge if something is going wrong on the coap or nanocoap side, since I'm aware you're not familiar with the code in ota/coap.c. Now that @fedepell explains, I guess the problem is on that part. Anyways if you have some hints it would be highly appreciated!

@kb2ma

kb2ma commented Nov 5, 2018

OK, just "@" me if you have specific questions. It sounds like @fedepell is on the right track. On the RIOT server side it looks like it's an application issue to track when it does not receive the next expected block -- or maybe there is some CRC/hash validation at the end.

I'm confused on the client side though. libcoap recognizes when it does not get a response to the verification packet (due to the delay) and retries. However, it does not seem to recognize when it does not get a response to the missing packet (#10 in the screenshot), and doesn't retry. Maybe libcoap is not looking at the block number in the response.

Also, I'm surprised the request/response block numbers don't match up in the Wireshark screenshot. I would expect to see the ACK for block #10 right after the CON for block #10. Maybe nanocoap also responds to all of the retries for the delayed verification block? I can't see enough of the Wireshark output to say.

If you want to forward the Wireshark pcap I can take a closer look and compare to RFC 7959.

@fedepell
Author

fedepell commented Nov 5, 2018

@kb2ma : thanks a lot for your help!

Please find the pcap attached. You will notice that at packet 3 there is no response for some time, so coap-client resends the packet. During that time RIOT is doing the signature verification, which is rather slow, so it is stalled for a while and coap-client keeps resending.
You can see that the response packet numbers get out of sync there: RIOT sends the ACKs for number 3 a few times later on (although it discards the packets as duplicates) and coap-client continues with the next packets even though, as you mention as well, there is now some kind of desync. This desync looks strange to me (and looks like a bug on the coap-client side too?)

ota-update-coap.zip

(I had to zip the pcapng file as GH will not support that extension)

Just a side note to help follow the packets: except for the first packet, which contains the update information, and the second, which contains the header data, all the others are filled with zeros apart from the first byte, which is an integer that grows each time. I did this to simplify tracking lost data over the channel, so don't worry if the update looks quite "zeroish" ;)

@kb2ma

kb2ma commented Nov 5, 2018

Also pinging @bergzand for any ideas since he implemented block2.

@kaspar030

Somehow this rings a bell. @bergzand didn't we see this before?

@kb2ma

kb2ma commented Nov 6, 2018

Thanks for the pcap, @fedepell. Here are my reactions.

Pkt. 13 -- libcoap retries msgid 29549. A CoAP server should respond with an empty ACK if it can't send a piggybacked ACK with the full response in a timely manner. See 7252, sec 5.2.2. nanocoap (and gcoap) should do this, but don't. This also might help libcoap correlate the responses.
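
For illustration, an empty ACK as described in RFC 7252 is just the 4-byte CoAP header with code 0.00 and no token. The sketch below builds one into a raw buffer. It is standalone and does not use the nanocoap or gcoap API; the macro and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COAP_VERSION   1u
#define COAP_TYPE_ACK  2u
#define COAP_EMPTY_LEN 4u /* an Empty message is the 4-byte header only */

/* Write an empty ACK for `msg_id` into `buf`.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t coap_build_empty_ack(uint8_t *buf, size_t buf_len, uint16_t msg_id)
{
    assert(buf != NULL);

    if (buf_len < COAP_EMPTY_LEN) {
        return 0;
    }
    /* Ver (2 bits) | Type (2 bits) | Token length (4 bits, zero here) */
    buf[0] = (uint8_t)((COAP_VERSION << 6) | (COAP_TYPE_ACK << 4));
    buf[1] = 0x00;                     /* code 0.00 (Empty) */
    buf[2] = (uint8_t)(msg_id >> 8);   /* message ID, network byte order */
    buf[3] = (uint8_t)(msg_id & 0xFFu);
    return COAP_EMPTY_LEN;
}
```

For msgid 29549 (0x736d) this produces the bytes 60 00 73 6d. The actual response would then have to follow later as a separate message once the verification has finished.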

Pkt. 21 -- nanocoap ACKs msgid 29549 again. This is appropriate. 7252, sec 4.5: "The recipient SHOULD acknowledge each duplicate copy of a Confirmable message using the same Acknowledgement or Reset message but SHOULD process any request or response in the message only once."

Pkt. 22 -- libcoap sends msgid 29551 for block 5. This is wrong IMO. It has not received the ACK for 29550 yet. I wonder if it is misinterpreting Pkt. 21 by only matching on token (length is zero, so everything matches) rather than also verifying msgid. libcoap really should disregard Pkt. 21 and the other duplicate ACKs for 29549. See 7252, sec. 5.3.2. -- an ACK MUST match the message ID and token.
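
A minimal sketch of the stricter matching suggested here, comparing both message ID and token before an outstanding request is considered acknowledged. This is not libcoap code; `coap_msg_key_t` and `ack_matches_request` are hypothetical names used only to show the idea.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COAP_MAX_TKL 8u /* CoAP tokens are 0 to 8 bytes long */

typedef struct {
    uint16_t msg_id;              /* CoAP message ID */
    uint8_t  tkl;                 /* token length in bytes */
    uint8_t  token[COAP_MAX_TKL]; /* token bytes, first `tkl` are valid */
} coap_msg_key_t;

/* Returns 1 only if the ACK carries the same message ID and the same
 * token as the pending request, 0 otherwise. */
static int ack_matches_request(const coap_msg_key_t *req,
                               const coap_msg_key_t *ack)
{
    assert(req != NULL && ack != NULL);

    if (req->msg_id != ack->msg_id) {
        return 0; /* duplicate ACK for an older message, disregard it */
    }
    if (req->tkl != ack->tkl) {
        return 0;
    }
    return memcmp(req->token, ack->token, req->tkl) == 0;
}
```

With zero-length tokens, as in this capture, a late duplicate ACK for msgid 29549 would still be rejected against a pending request with msgid 29550 because the message IDs differ.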

Ideas:

  1. Add a token to the requests from libcoap that varies with each block. One byte should be long enough. This may help libcoap match the ACKs from nanocoap. Unfortunately, the libcoap command line client doesn't seem to have an option to vary the token for each block.

  2. To save space in the PUT requests, add a "-U" parameter to the libcoap coap-client command. This will remove the Uri-Host option in the requests.

@fedepell
Author

fedepell commented Nov 6, 2018

@kb2ma: Thanks for the detailed analysis!

I did a session now with high verbosity and your number 3 seems to be confirmed by the output here (my comments in spaced lines with [ ] ):


[ here is the first packet when RIOT side is not responding being sent first time, id = 0x736d
   / 29549 ]

v:1 t:CON c:PUT i:736d {} [ Uri-Host:fe80::5:d6ff:fe16:2131%tap0, Uri-Path:firmware, Block1:3/M/64 ]
 :: '\x00\x80\x00\x00\x08\x00\x00\x00\xB0\x09@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00dJ\x18\x09\xC9\x8C\xCF\xE4V\x0A@\x00\x00\x00\x00\x00\x80\x02\x1A!\xFD\x7F\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Nov 06 19:42:25 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29549 added to retransmit queue (2938ms)

[ timeout so retransmit ]

Nov 06 19:42:28 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683 
 UDP tid=29549: retransmission #1
Nov 06 19:42:28 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: sent 110 bytes
v:1 t:CON c:PUT i:736d {} [ Uri-Host:fe80::5:d6ff:fe16:2131%tap0, Uri-Path:firmware, Block1:3/M/64 ]
 :: '\x00\x80\x00\x00\x08\x00\x00\x00\xB0\x09@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00dJ\x18\x09\xC9\x8C\xCF\xE4V\x0A@\x00\x00\x00\x00\x00\x80\x02\x1A!\xFD\x7F\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

[ timeout so retransmit ]

Nov 06 19:42:34 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29549: retransmission #2
Nov 06 19:42:34 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: sent 110 bytes
v:1 t:CON c:PUT i:736d {} [ Uri-Host:fe80::5:d6ff:fe16:2131%tap0, Uri-Path:firmware, Block1:3/M/64 ]
 :: '\x00\x80\x00\x00\x08\x00\x00\x00\xB0\x09@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00dJ\x18\x09\xC9\x8C\xCF\xE4V\x0A@\x00\x00\x00\x00\x00\x80\x02\x1A!\xFD\x7F\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'


[ now RIOT is back alive and sends a response, tid = 0x736d which is correct, the one before 
  and it gets removed on client side ]

Nov 06 19:42:45 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: received 7 bytes
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]
Nov 06 19:42:45 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29549: removed
Nov 06 19:42:45 DEBG *** EVENT: 0x2001
Nov 06 19:42:45 DEBG ** process incoming 2.31 response:
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]
Nov 06 19:42:45 DEBG found Block1 option, block size is 2, block nr. 3

[ now the next block is sent, tid = 0x736e / 29550 ]

Nov 06 19:42:45 DEBG send block 4
v:1 t:CON c:PUT i:736e {} [ Uri-Host:fe80::5:d6ff:fe16:2131%tap0, Uri-Path:firmware, Block1:4/M/64 ]
 :: '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Nov 06 19:42:45 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: sent 110 bytes
v:1 t:CON c:PUT i:736e {} [ Uri-Host:fe80::5:d6ff:fe16:2131%tap0, Uri-Path:firmware, Block1:4/M/64 ]
 :: '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Nov 06 19:42:45 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29550 added to retransmit queue (2156ms)

[ a response is received from RIOT, but still for the old retransmitted packet, id = 0x736d
  / 29549, yet the new packet 29550 is removed from the queue! ]

Nov 06 19:42:46 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: received 7 bytes
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]
Nov 06 19:42:46 DEBG *** EVENT: 0x2001
Nov 06 19:42:46 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29550: removed

[ one more ACK is received from RIOT, still id = 0x736d but nothing is done, queue is empty ]

Nov 06 19:42:46 DEBG ** process incoming 2.31 response:
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]

[ now next block is being sent, id = 0x736f / 29551 ]

Nov 06 19:42:46 DEBG found Block1 option, block size is 2, block nr. 3
Nov 06 19:42:46 DEBG send block 5
v:1 t:CON c:PUT i:736f {} [ Uri-Host:fe80::5:d6ff:fe16:2131%tap0, Uri-Path:firmware, Block1:5/M/64 ]
 :: '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Nov 06 19:42:46 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: sent 110 bytes
v:1 t:CON c:PUT i:736f {} [ Uri-Host:fe80::5:d6ff:fe16:2131%tap0, Uri-Path:firmware, Block1:5/M/64 ]
 :: '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Nov 06 19:42:46 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29551 added to retransmit queue (2813ms)

[ an ACK is received from RIOT with id = 0x736d / 29549, still one of the first ones from the initial timeouts ]

Nov 06 19:42:46 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: received 7 bytes
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]
Nov 06 19:42:46 DEBG *** EVENT: 0x2001

[ but coap-client matches it to the latest sent packet 29551 ]

Nov 06 19:42:46 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29551: removed
Nov 06 19:42:46 DEBG ** process incoming 2.31 response:
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]

[ and just a bit later RIOT goes on with higher ids, but they are still matched to the wrong,
  newer packets: an ACK arrives for 0x736e but is matched to 29552 (0x7370) ]

Nov 06 19:42:46 DEBG *  [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP: received 7 bytes
v:1 t:ACK c:2.31 i:736e {} [ Block1:4/M/64 ]
Nov 06 19:42:46 DEBG *** EVENT: 0x2001
Nov 06 19:42:46 DEBG ** [fe80::5c15:61ff:fe72:267e]:49284 <-> [fe80::5:d6ff:fe16:2131]:5683
 UDP tid=29552: removed


Hope things are readable with this ugly formatting!
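
The mismatches annotated in the log above can also be flagged mechanically. The following is a minimal sketch, assuming the debug output keeps the two line shapes visible in this excerpt (`t:ACK ... i:<hex id>` and `tid=<decimal>: removed`) and that, as the annotations suggest, the `tid` printed by coap-client equals the message ID in decimal. The names `find_mismatches` and `SAMPLE` are made up for illustration and are not part of libcoap.

```python
import re

# Patterns for the two line shapes seen in the excerpt (an assumption about
# the log format, not a documented libcoap interface).
ACK_RE = re.compile(r"t:ACK\b.*?\bi:([0-9a-f]+)")
REMOVED_RE = re.compile(r"tid=(\d+): removed")


def find_mismatches(log_text):
    """Return (ack_id, removed_tid) pairs where the removed entry differs
    from the message ID of the most recently logged ACK."""
    last_ack = None
    mismatches = []
    for line in log_text.splitlines():
        ack = ACK_RE.search(line)
        if ack:
            last_ack = int(ack.group(1), 16)  # message ID is printed in hex
            continue
        removed = REMOVED_RE.search(line)
        if removed and last_ack is not None:
            tid = int(removed.group(1))       # tid is printed in decimal
            if tid != last_ack:
                mismatches.append((last_ack, tid))
    return mismatches


# Four lines copied from the log above: one correct match, one mismatch.
SAMPLE = """\
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]
 UDP tid=29549: removed
v:1 t:ACK c:2.31 i:736d {} [ Block1:3/M/64 ]
 UDP tid=29550: removed
"""

for ack_id, tid in find_mismatches(SAMPLE):
    print(f"ACK for 0x{ack_id:04x} ({ack_id}) removed tid {tid}")
```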

@bergzand

bergzand commented Nov 6, 2018

Let me catch up to the backlog here

Also pinging @bergzand for any ideas since he implemented block2.

This is all still using only the block1 code right? Did I miss something in reading this PR?

Somehow this rings a bell. @bergzand didn't we see this before?

Very familiar. @fedepell If I remember correctly (it has been a few months since I worked on this), this is what can happen: after the verification is done, a CoAP ACK is transmitted back to coap-client. The CoAP handler then starts processing the next packet in the queue. This is again bytes 192-256, since one of the duplicates transmitted by coap-client is still in a buffer somewhere. The handler of course discards this packet as a duplicate and transmits a response back to the client telling it to continue. At this point a "bug" occurs in libcoap: this continue is mismatched as a response to the newly transmitted bytes 256-320. Somehow this causes the CoAP handler on the node to never receive that packet (I don't remember exactly what happens here), while the coap-client application continues with bytes 320-384.

Maybe libcoap is not looking at the block number in the response.

If my memory serves me right this is spot on!
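
For readers mapping the log to the byte ranges mentioned above: the "block size is 2" in the debug output is the SZX field of the Block1 option, and RFC 7959 defines the block size as 2**(SZX + 4) bytes. A small sketch of that arithmetic follows; the helper name `block1_range` is made up for illustration.

```python
def block1_range(num, szx):
    """Return the (start, end) byte offsets covered by a Block1 option.

    num is the block number, szx the size exponent from the option.
    """
    if not 0 <= szx <= 6:
        raise ValueError("SZX must be in 0..6")
    size = 1 << (szx + 4)  # RFC 7959: block size = 2**(SZX + 4)
    start = num * size
    return start, start + size


# Blocks 3, 4 and 5 from the log, all with SZX = 2 (64-byte blocks).
for num in (3, 4, 5):
    print(num, block1_range(num, szx=2))
```

With SZX = 2 this gives 192-256, 256-320 and 320-384 for blocks 3, 4 and 5, which are the ranges named in the explanation above.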

@bergzand

bergzand commented Nov 6, 2018

Maybe libcoap is not looking at the block number in the response.

If my memory serves me right this is spot on!

I think the RFC is also not completely clear on whether the client should match block numbers in requests and responses.

@fedepell
Author

fedepell commented Nov 7, 2018

Just to wrap up how the problem of coap-client is generated:

  • Some packets must have timed out, so there are extra ACKs on the wire for the retransmitted packets
  • Some packet must be lost on the way from client to server (RIOT)

When a packet is lost (the second condition), it is not retransmitted: an extra ACK for a previously retransmitted packet gets "used" to match the lost packet, and the transfer continues without the retransmission.

The first condition always holds, since RIOT/OTA does not respond while it is doing the verification (so usually, at least on my SAML, 3-4 extra ACKs are generated). The second happens very often when we use Ethos, as there seems to be some other underlying problem there (my feeling is that when flash is being written the UART buffer either gets full or something even nastier happens).

Of course this can happen in general: over wireless, GPRS or similar links (OTA indeed), UDP packets may be delayed and lost. With ethos we just have a quite easy way to reproduce this delay+loss.
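
The effect described in this wrap-up can be reproduced with a toy model. This is only a sketch of the matching behaviour, not libcoap's actual code: the client has one pending message at a time, and `check_mid` (a made-up flag) decides whether an incoming ACK must carry the pending message ID before the client moves on to the next block.

```python
def match_acks(first_mid, acks, check_mid):
    """Replay a sequence of incoming ACK message IDs against a client that
    always has exactly one pending message.

    Returns a list of (pending_mid, ack_mid) pairs, one for each ACK that
    caused the pending message to be removed from the retransmit queue.
    """
    pending = first_mid
    removed = []
    for ack in acks:
        if check_mid and ack != pending:
            continue  # stale duplicate ACK: ignore it and keep waiting
        removed.append((pending, ack))
        pending += 1  # the next block goes out with the next message ID
    return removed


# ACK message IDs in the order they appear in the log above.
ACKS = [0x736D, 0x736D, 0x736D, 0x736E]

loose = match_acks(0x736D, ACKS, check_mid=False)
strict = match_acks(0x736D, ACKS, check_mid=True)

print("without ID check:", loose)
print("with ID check:   ", strict)
```

Without the check, the three ACKs for 0x736d remove 29549, 29550 and 29551 in turn, as in the log, so a lost block would never be retransmitted. With the check, the duplicates are ignored and 29550 stays queued until its own ACK arrives.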

@bergzand

bergzand commented Nov 7, 2018

Ethos as there seems to be some other underlying problem there (my feeling is that when flash is being written the uart buffer either gets full or something even nastier).

I've had some concurrency issues with ethos before (RIOT-OS#9890), but there the whole ethos quits functioning. Maybe you could try and check if that solves some of the stability issues for you.

Note that during the cryptographic operations the whole CoAP server on the node is unresponsive, but packets are still received (since that happens in different threads).

@kb2ma

kb2ma commented Nov 7, 2018

@fedepell, your wrap-up sounds right to me. I think there are three things to follow up on with CoAP.

  1. Check with libcoap on why it processes the duplicate ACKs. Your pcap illustrates the problem pretty clearly.

  2. As an experiment, try hacking nanocoap to accommodate the delayed response for firmware verification. First, send an immediate ACK response. Then when verification completes, send a new message with code 2.0x with the same token but a new message ID. See the example on pg. 107 of RFC 7252. Be sure nanocoap_server discards the response, which it will not be expecting.

  3. Implement separate responses like in (2), but rationally within the framework. I think gcoap actually is better suited to this kind of exchange because gcoap itself handles both requests and responses. I'm not sure how much effort this would take, but I'd be happy to take a crack at it if the OTA group thinks it's essential.
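
To make the separate-response exchange in (2) concrete, here is a sketch of the two messages on the wire, built from the fixed 4-byte CoAP header defined in RFC 7252. It is an illustration only and does not use nanocoap or libcoap. The token `b"\x01"`, the new message ID 0x1234 and the choice of 2.04 as the "2.0x" code are assumptions; note that the transfer in the log uses an empty token, which would make token-based matching of a separate response ambiguous.

```python
import struct

# CoAP message types (RFC 7252, section 3)
CON, NON, ACK, RST = 0, 1, 2, 3


def coap_code(cls, detail):
    """Encode a c.dd code such as 2.04 into its single-byte form."""
    return (cls << 5) | detail


def coap_header(msg_type, code, mid, token=b""):
    """Build the fixed CoAP header (version 1) followed by the token."""
    if len(token) > 8:
        raise ValueError("token must be at most 8 bytes")
    first = (1 << 6) | (msg_type << 4) | len(token)
    return struct.pack("!BBH", first, code, mid) + token


request_mid = 0x736D  # message ID of the PUT carrying the block
token = b"\x01"       # hypothetical token shared by request and response

# Step 1: empty ACK sent immediately, echoing the request's message ID.
# An empty message has code 0.00 and carries no token.
empty_ack = coap_header(ACK, coap_code(0, 0), request_mid)

# Step 2: once verification completes, a new confirmable message with the
# same token but a fresh message ID carries the actual response code.
separate = coap_header(CON, coap_code(2, 4), 0x1234, token)

print(empty_ack.hex(), separate.hex())
```

The client is expected to acknowledge the second message with its own empty ACK for the new message ID.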

@kaspar030

2. First, send an immediate ACK response. Then when verification completes, send a new message with code 2.0x with the same token but a new message ID.

Well, we considered that. The problem is that without serious refactoring of the code, this is going to be difficult. From within the handler, nanocoap doesn't know its peer...

@kb2ma

kb2ma commented Nov 7, 2018

From within the handler, nanocoap doesn't know its peer...

You're right, I had not considered that. gcoap has the same limitation because it uses the same coap_handler_t callback for request handling. As a hack/experiment, we could add a 'remote_client' sock_udp_ep_t* attribute to coap_pkt_t until we devise a general solution.

For the general solution, it might make sense to update the coap_handler_t callback to accept a more structured context object rather than the current resource context void*. A structured context object (call it coap_response_memo_t) could include the current resource.context as well as the remote peer sock_udp_ep_t*, and whatever else we need to add in the future.

In the other direction, gcoap uses the gcoap_request_memo_t for a client to remember what it sent to the server. In fact, RIOT-OS#9857 plans to add gcoap_request_memo_t to the gcoap response handler callback.
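
The shape of the proposed context object can be sketched as follows. This is a language-neutral illustration written in Python rather than RIOT's C, and every name in it is hypothetical: `ResponseMemo` stands in for the suggested coap_response_memo_t, `remote` for the sock_udp_ep_t* peer, and `firmware_handler` for a request handler that receives the memo instead of a bare context pointer.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class ResponseMemo:
    """Hypothetical stand-in for the proposed coap_response_memo_t."""
    context: Any             # what the handler receives today as void *context
    remote: Tuple[str, int]  # peer address and port (sock_udp_ep_t* in C)


def firmware_handler(payload: bytes, memo: ResponseMemo) -> str:
    """A handler that knows its peer can address a later separate response."""
    host, port = memo.remote
    return f"separate response for {len(payload)} bytes goes to [{host}]:{port}"


# Peer address and port taken from the client side of the log above.
memo = ResponseMemo(context=None, remote=("fe80::5c15:61ff:fe72:267e", 49284))
print(firmware_handler(b"\x00" * 64, memo))
```

Bundling the fields in one object means further per-request state can be added later without changing the handler signature again.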

@kYc0o
Owner

kYc0o commented Nov 7, 2018

Thanks a lot @kb2ma @kaspar030 and @bergzand for your insights!

Let me know if I'm wrong, but for me (and for our current use case) the "problem" is that we pause the block1 transfer for too long and thus provoke an undesired behaviour which leads to packet loss and sequence mismatching. Though I acknowledge the problems found, I think our first patch would be to process this first part of the binary (a.k.a metadata) separately from the firmware e.g. in two separate transfers. That is how SUIT does it anyway, and I think we'll follow that direction by default.

To summarize, how bad is it to make a CoAP block1 transfer wait for a "long time" before it's considered a timeout? I know that in our case it is not a timeout, since as @bergzand said the other threads continue to receive the packets; however, we have the problems described here.

@fedepell
Author

fedepell commented Nov 7, 2018

Just one question. @kb2ma said:

Pkt. 22 -- libcoap sends msgid 29551 for block 5. This is wrong IMO. It has not received the ACK for 29550 yet. I wonder if it is misinterpreting Pkt. 21 by only matching on token (length is zero, so everything matches) rather than also verifying msgid. libcoap really should disregard Pkt. 21 and the other duplicate ACKs for 29549. See 7252, sec. 5.3.2. -- an ACK MUST match the message ID and token.

If this were true, and coap-client continued with the next ID only once it had received the ACK for the previous one, then the whole process should work correctly, no?
The 3-4 extra ACKs would be received and discarded while waiting for the correct ACK to arrive. Once it arrived, the transfer would keep going correctly. And if a packet were lost, it would get resent after the timeout.

So my question is: isn't what kb2ma mentions necessary? While I must admit I haven't read the RFC (yet), it sounds strange to me that we have IDs if nobody then matches them against an ACK.

@kb2ma

kb2ma commented Nov 8, 2018

To summarize, how bad is it to make a CoAP block1 transfer wait for a "long time" before it's considered a timeout?

In the pcap @fedepell sent, the ACK from nanocoap is stamped 21 seconds after the initial PUT from libcoap. During that time period libcoap has sent 3 retransmissions. IMO, it's a mistake to rely on the retransmit mechanism to recover from a delay of this length. The user expects a firmware update to be reliable, and this is flaky from the outset without even considering that the wireless link itself may be challenging.

I think our first patch would be to process this first part of the binary (a.k.a metadata) separately from the firmware e.g. in two separate transfers.

I agree that this is a good choice. Let's take the time to implement separate confirmable responses and understand what's happening with libcoap. Then we can recombine these two steps again into a single, reliable transfer.
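
The figure of 3 retransmissions within 21 seconds is consistent with the default transmission parameters of RFC 7252 (ACK_TIMEOUT = 2 s, ACK_RANDOM_FACTOR = 1.5, MAX_RETRANSMIT = 4), where the timeout doubles after every retransmission. The sketch below assumes coap-client runs with those defaults, which matches the 2156 ms and 2813 ms initial timeouts in the log; the function names are made up for illustration.

```python
# Default CoAP transmission parameters (RFC 7252, section 4.8)
ACK_TIMEOUT = 2.0        # seconds
ACK_RANDOM_FACTOR = 1.5
MAX_RETRANSMIT = 4


def retransmit_times(initial_timeout):
    """Seconds after the first transmission at which each retransmission
    is sent, given the randomly chosen initial timeout."""
    if not ACK_TIMEOUT <= initial_timeout <= ACK_TIMEOUT * ACK_RANDOM_FACTOR:
        raise ValueError("initial timeout outside the allowed range")
    times = []
    elapsed = 0.0
    timeout = initial_timeout
    for _ in range(MAX_RETRANSMIT):
        elapsed += timeout
        times.append(elapsed)
        timeout *= 2  # binary exponential back-off
    return times


def retransmissions_before(delay, initial_timeout):
    """How many retransmissions go out before a response delayed by `delay`."""
    return sum(1 for t in retransmit_times(initial_timeout) if t < delay)


print(retransmit_times(2.0))
print(retransmissions_before(21, 2.156))  # 2156 ms as seen in the log
```

Each of those retransmissions is later answered by the server, which is where the extra ACKs discussed in this thread come from.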

@kb2ma

kb2ma commented Nov 8, 2018

So my question is: isn't what kb2ma mentions necessary?

I see two problems in the pcap you sent:

  1. libcoap appears to accept the duplicate ACKs for the retransmissions rather than disregard them.

  2. nanocoap does not send an immediate ACK and a separate response for the processing delay. That is the proper way to handle a delay of 21 seconds.

If either of these two shortcomings were addressed, the requests/responses should stay synchronized -- aside from the separate issue of RIOT not sending the ACK for one block (block 14 in the pcap you sent).

@fedepell
Author

fedepell commented Nov 8, 2018

@kb2ma : Thanks a lot for the info!

So my question is: isn't what kb2ma mentions necessary?

I see two problems in the pcap you sent:

1. libcoap appears to accept the duplicate ACKs for the retransmissions rather than disregard them.

2. nanocoap does not send an immediate ACK and a separate response for the processing delay. That is the proper way to handle a delay of 21 seconds.

If either of these two shortcomings were addressed, the requests/responses should stay synchronized -- aside from the separate issue of RIOT not sending the ACK for one block (block 14 in the pcap you sent).

I would say (generally speaking) both should be fixed, so that both sides (libcoap on one side and nanocoap on the other) are theoretically interoperable with any other CoAP counterpart, no?

My question was actually pointing at this: fixing one would suffice at least for now to continue and it sounds to me that fixing libcoap could be easier than nanocoap. (maybe naive thinking :) )

As for block 14: that is the packet that RIOT never gets, due to data lost on the channel (the Wireshark capture was taken on the PC side). That is what actually breaks the complete update; if it weren't missing, RIOT would just keep gluing blocks together correctly despite the ACKs being out of sync.

At that point, if everything worked, the communication should have stopped, the packet should have been resent, and so on.

@kb2ma

kb2ma commented Nov 8, 2018

I think our first patch would be to process this first part of the binary (a.k.a metadata) separately from the firmware e.g. in two separate transfers.

I agree that this is a good choice. Let's take the time to implement separate confirmable responses and understand what's happening with libcoap. Then we can recombine these two steps again into a single, reliable transfer.

Thinking about this a little more. Even if there are two separate transfers, isn't there still an issue with the processing delay? Certainly it would be better if this occurred while preparing the ACK for the last packet of the first transfer. I would still expect libcoap to retransmit the last packet though while waiting for the ACK.

[ just edited last sentence --libcoap, not nanocoap ]

@kb2ma

kb2ma commented Nov 8, 2018

My question was actually pointing at this: fixing one would suffice at least for now to continue and it sounds to me that fixing libcoap could be easier than nanocoap. (maybe naive thinking :) )

Yes, I think that fixing either one would be sufficient. I don't know which would be easier. I suggest sharing your pcap with the libcoap project. My experience is that @obgm is responsive to good questions. Copy me, and I'll jump in if necessary. :-)

@fedepell
Author

fedepell commented Nov 8, 2018

Done (obgm/libcoap#270) ! ;)

@fedepell
Author

fedepell commented Nov 9, 2018

@kYc0o if you want to test with libcoap/coap-client from obgm/libcoap#271 it works much better now.

@kYc0o
Owner

kYc0o commented Nov 15, 2018

I'll try to take a look this weekend.

@kYc0o
Owner

kYc0o commented Dec 3, 2018

Sorry for the long wait, I was very busy lately.

Changes upstream were merged, so I'll try to test asap with current upstream master of libcoap.

@obgm

obgm commented Dec 3, 2018

@kYc0o Be aware that the libcoap changes are still in the develop branch, not master.

@kYc0o
Owner

kYc0o commented Dec 4, 2018

Oh! Thanks @obgm, any ETA for the change being merged into master or for the next release?

@obgm

obgm commented Dec 4, 2018

We are currently preparing 4.2.0-RC3, final release hopefully still in 2018.
Note that the new version 4.2.0 will bring a few API changes.

@kYc0o
Owner

kYc0o commented Mar 23, 2019

As the base branch for this PR is quite outdated, I guess there may be no point in merging it.

Tests pass so far.

@fedepell are you still using something from here?

@fedepell
Author

@kYc0o : no problem, we could discard this branch indeed. Once OTA gets implemented in master I could add this doc as I think it is very useful for fast testing (or for boards like my SAML21-XPRO without wireless/net) but for the time being there is not much to do I guess.
