section .text will not fit in region iram1_0_seg #4551

Closed
4 of 5 tasks
dalbert2 opened this issue Mar 23, 2018 · 42 comments

@dalbert2
Contributor

dalbert2 commented Mar 23, 2018

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • [ N/A ] If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: [ESP-12F]
  • Core Version: [2.4.1 Release]
  • Development Env: [Arduino IDE and VisualMicro (tried both)]
  • Operating System: [Windows]

Settings in IDE

  • Module: [Generic ESP8266 Module|Nodemcu|WifInfo (tried all 3)]
  • Flash Mode: [qio]
  • Flash Size: [4MB]
  • lwip Variant: [v2 Lower Memory]
  • Reset Method: [nodemcu]
  • Flash Frequency: [40Mhz]
  • CPU Frequency: [160MHz]
  • Upload Using: [SERIAL]
  • Upload Speed: [115200] (serial upload only)

Problem Description

Section full when building with 2.4.1 (works fine with 2.4.0)

Codebase that builds perfectly under 2.4.0 will not build with 2.4.1

Building 1.0.22 release using 2.4.0 core for WifInfo (ESP12 4M/1M):

Compiling 'Gateway2' for 'WifInfo'
Program size: 442,404 bytes (used 42% of a 1,044,464 byte maximum) (164.09 secs)
Minimum Memory Usage: 46580 bytes (57% of a 81920 byte maximum)

Building 1.0.22 release using 2.4.1 core for WifInfo (ESP12 4M/1M):

Compiling 'Gateway2' for 'WifInfo'
ld.exe: C:\Users\david\AppData\Local\Temp\VMBuilds\Gateway2\esp8266_wifinfo\Debug\Gateway2.ino.elf section .text will not fit in region iram1_0_seg
 

Error linking for board WifInfo
Build failed for project 'Gateway2'
collect2.exe*: error: ld returned 1 exit status

Building 1.0.22 release using 2.4.1 core for Generic ESP8266 module using Arduino IDE

Arduino: 1.8.5 (Windows 10), Board: "Generic ESP8266 Module, 160 MHz, nodemcu, 26 MHz, 40MHz, QIO, 4M (1M SPIFFS), 2, v2 Lower Memory, Disabled, None, Only Sketch, 115200"
...
Linking everything together...
"C:\Users\david\AppData\Local\Arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\1.20.0-26-gb404fb9-2/bin/xtensa-lx106-elf-gcc" -g -w -Os -nostdlib -Wl,--no-check-sections -u call_user_start -u _printf_float -u _scanf_float -Wl,-static "-LC:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1/tools/sdk/lib" "-LC:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1/tools/sdk/ld" "-LC:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1/tools/sdk/libc/xtensa-lx106-elf/lib" "-Teagle.flash.4m1m.ld" -Wl,--gc-sections -Wl,-wrap,system_restart_local -Wl,-wrap,spi_flash_read  -o "C:\Users\david\AppData\Local\Temp\arduino_build_376519/Gateway2.ino.elf" -Wl,--start-group "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\gw2icon.c.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\DNSServer.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\Gateway2.ino.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\command.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\commands.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\config.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\crc16.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\crc32.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\filesystem.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\fw_version.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\fwupdate.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\led.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\neighbors.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\network.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\protocol.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\radio.cpp.o" 
"C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\report.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\semtech.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\sensor.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\switch.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\telnet.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\web_css.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\web_html.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\web_js.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\webserver.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\src\NtpClient\NTPClientLib.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\src\Time\DateStrings.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\sketch\src\Time\Time.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\EEPROM\EEPROM.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\ESP8266WiFi.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\ESP8266WiFiAP.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\ESP8266WiFiGeneric.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\ESP8266WiFiMulti.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\ESP8266WiFiSTA.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\ESP8266WiFiScan.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\WiFiClient.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\WiFiClientSecure.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\WiFiServer.cpp.o" 
"C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\WiFiServerSecure.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WiFi\WiFiUdp.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\SPI\SPI.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266HTTPClient\ESP8266HTTPClient.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266httpUpdate\ESP8266httpUpdate.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\Ticker\Ticker.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266mDNS\ESP8266mDNS.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266SSDP\ESP8266SSDP.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WebServer\ESP8266WebServer.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WebServer\ESP8266WebServerSecure.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WebServer\Parsing.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519\libraries\ESP8266WebServer\detail\mimetable.cpp.o" "C:\Users\david\AppData\Local\Temp\arduino_build_376519/arduino.ar" -lhal -lphy -lpp -lnet80211 -llwip2 -lwpa -lcrypto -lmain -lwps -laxtls -lespnow -lsmartconfig -lairkiss -lwpa2 -lstdc++ -lm -lc -lgcc -Wl,--end-group  "-LC:\Users\david\AppData\Local\Temp\arduino_build_376519"
c:/users/david/appdata/local/arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/1.20.0-26-gb404fb9-2/bin/../lib/gcc/xtensa-lx106-elf/4.8.2/../../../../xtensa-lx106-elf/bin/ld.exe: C:\Users\david\AppData\Local\Temp\arduino_build_376519/Gateway2.ino.elf section `.text' will not fit in region `iram1_0_seg'

collect2.exe: error: ld returned 1 exit status

Using library EEPROM at version 1.0 in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\EEPROM
Using library ESP8266WiFi at version 1.0 in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\ESP8266WiFi
Using library SPI at version 1.0 in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\SPI
Using library ESP8266HTTPClient at version 1.1 in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\ESP8266HTTPClient
Using library ESP8266httpUpdate at version 1.1 in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\ESP8266httpUpdate
Using library Ticker at version 1.0 in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\Ticker
Using library ESP8266mDNS in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\ESP8266mDNS (legacy)
Using library ESP8266SSDP in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\ESP8266SSDP (legacy)
Using library ESP8266WebServer at version 1.0 in folder: C:\Users\david\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.1\libraries\ESP8266WebServer
exit status 1
Error compiling for board Generic ESP8266 Module.

MCVE Sketch

It's a big proprietary codebase, I'm sorry, I can't post it.

Debug Messages

Debug messages go here
@Pablo2048

Pablo2048 commented Mar 23, 2018

Well, you can't post the code so how do you think one can help you? Do you use F(), PSTR, ICACHE_FLASH_ATTR...?

@dalbert2
Contributor Author

dalbert2 commented Mar 23, 2018

Pablo, I appreciate the issue with posting defects without code; unfortunately, this issue is with the final link step. There are many libraries and tools where addressing problems does not require public sharing of a complete source code base. Were this required to debug popular toolchains, they couldn't be used for anything closed-source.

I posted the issue for two reasons: 1) it seems like a serious problem with the 2.4.1 release and 2) sometimes the issue is obvious to an experienced developer who may be able to point me in the right direction quickly (I've already spent hours on this).

I am very experienced with bare-metal ARM/Cortex development using the gnu toolchain, but am relatively new to the xtensa tools and so far have been blissfully isolated from them thanks to the delightful Arduino/ESP8266 Core IDE/tools you have so graciously made possible. I'm going through the linker scripts and build products now to figure out what's going on and will post more on this as I dig deeper.

@Pablo2048

@dalbert2 you have to test the issue with the git version first (IMO this has been suggested many times by developers). IMO no one can tell you what is wrong with next to no information about your project. To be more specific - did you check that you use ICACHE_FLASH_ATTR decorators where possible, as I suggested? That's all I can say...

@igrr
Member

igrr commented Mar 23, 2018

I think we have recently moved some stuff (vtables?) into IRAM to free up some heap space. If your application used almost all available IRAM in the previous version it may fail to link with the new one. You can check this by running xtensa-lx106-elf-objdump -h your_sketch.elf with the old (2.4.0) version, and see how much of the IRAM space was used.

@dalbert2
Contributor Author

Thank you both for your help!

Pablo, yes I make extensive use of ICACHE_FLASH_ATTR for all code except functions that will be called from an interrupt context. I also make extensive use of PROGMEM for literals.

Ivan, thank you for jumping in! My .text section (built with 2.4.0) is 0x7f43 bytes, so yes, it's on the edge and apparently the change to 2.4.1 pushed it over the limit. The majority of my code (0x60a2d bytes) is in .irom0.text located in SPI flash, but it looks like I will need to find a way to reclaim some instruction RAM.

A request: the build output reports "Memory Usage" which I had assumed to be RAM, but I now understand is only data RAM (.data+.rodata+.bss). It also reports "Program Size" which is SPI Flash usage (.irom0.text+.text+.rodata+?). However, it doesn't let you know how close you are to the 0x8000 (32K) limit of instruction RAM which is actually the tightest memory constraint on the platform. It would be very nice if the build output included % IRAM used. This would help many Arduino users who won't be familiar with code segmentation at all and it will also help experienced embedded developers who are not familiar with the split regions for instruction and data RAM (since your very nice Arduino environment insulates us from all of that). For example, I knew some of my code would be located in RAM, but I didn't realize that instruction RAM was separate from data RAM, so when I saw I had plenty of free RAM, I thought I was OK.
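To make the headroom arithmetic concrete, here is a minimal sketch of the kind of "% IRAM used" figure requested above. It assumes the 0x8000-byte IRAM limit quoted in this comment; the names `kIramSize`, `iramFree` and `iramPercentUsed` are made up for illustration and are not part of the core or the IDE.

```cpp
#include <cstdint>

// IRAM limit as quoted in this thread: 0x8000 bytes (32 KiB).
constexpr std::uint32_t kIramSize = 0x8000;

// Bytes of instruction RAM still free for a given .text size.
constexpr std::uint32_t iramFree(std::uint32_t textSize) {
    return textSize >= kIramSize ? 0u : kIramSize - textSize;
}

// Percentage of instruction RAM used, rounded down.
constexpr std::uint32_t iramPercentUsed(std::uint32_t textSize) {
    return textSize * 100u / kIramSize;
}

// The 2.4.0 build reported above: .text = 0x7f43 bytes.
static_assert(iramFree(0x7f43) == 0xBD, "189 bytes of IRAM left");
static_assert(iramPercentUsed(0x7f43) == 99, "99% of IRAM used");
```

With only 189 bytes to spare under 2.4.0, any extra IRAM use in the core is enough to break the link.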

Thank you both again for the quick responses and for helping me understand the issue!

For my reference and for anyone who comes across this later:
https://github.com/esp8266/esp8266-wiki/wiki/Memory-Map

@Adam5Wu
Contributor

Adam5Wu commented Mar 23, 2018

I have experienced a similar issue recently - if I enable debug serial I get the error message.

Like @igrr suspected, my issue was related to the vtable placement.
It seems the base SDK + Arduino already consumed around 7300h iram;
And vtables in my code pushed the use to around 7fc0h;
There is not enough room for debug serial code anymore.

I had to remove some virtual destructors in the base classes to reduce vtable size.

@dalbert2
Contributor Author

@Adam5Wu I haven't had a chance to check yet, but if the base SDK + Arduino pushes IRAM to 0x7300 bytes, that is a very serious problem; it means there's only a smidge more than 3K of instruction RAM left for the user.

@devyte
Collaborator

devyte commented Mar 23, 2018

Hi @dalbert2 , I still don't have plans to travel to Maryland, so that beer is still pending :)

One PR that could be relevant for you is #4384 , although I somehow doubt it will be enough.
As explained above, several things have been moved out of heap into iram, which makes things tighter. I'm considering whether to leave this issue open to track related effort, but a quick look at the current code declared with ICACHE_RAM_ATTR tells me there isn't that much that can really be moved here in the core. I'm going to take another overview over the weekend, and decide then whether to keep this open or not.

Are you using your own ISRs? If you are, and they contain significant amount of code, I have an idea that may help you out. Let me know if that's the case, and we can discuss details. Also, if it's viable, I could take a private look at your code, and provide some feedback as to what you could do, as I have done a couple of times for others.

@dalbert2
Contributor Author

Hi @devyte, the beer will be waiting when you have time to visit!

Thank you for the pointer to #4384. If the cont_* functions haven't been moved to SPI Flash in 2.4.1, I'll try making a build using git head (the instructions for that don't seem to work but I'll figure it out) or I will make the changes to 2.4.1 locally; thanks for the suggestion and for finding more IRAM space to reclaim!

My code has several ISRs; the application is a wireless gateway (proprietary telemetry network to WiFi) so there are ISRs to respond to radio events and a couple of front-panel push-buttons. My understanding is that any code that runs in the interrupt context must be located in IRAM. Now that I understand the problem (extremely limited space for ISR code), I can likely fix by reducing ISR functionality and shifting more of the work to the event loop. If I run into trouble, I'll happily take you up on your kind offer and if you have other approaches for freeing up IRAM, I'd love to hear them.
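A minimal sketch of the "thin ISR, work in the event loop" approach described above. All names (`radioEventPending`, `radioIsr`, `serviceRadio`, `handledEvents`) are hypothetical; the empty `ICACHE_RAM_ATTR` fallback is an assumption that only lets the snippet compile off-target, since on the ESP8266 core the macro is already defined.

```cpp
#include <cstdint>

// On the ESP8266 Arduino core this macro places a function in IRAM.
// The empty fallback exists only so the sketch compiles on a host.
#ifndef ICACHE_RAM_ATTR
#define ICACHE_RAM_ATTR
#endif

// Hypothetical flag shared between the ISR and the main loop.
volatile bool radioEventPending = false;

// Hypothetical counter standing in for the real event processing.
std::uint32_t handledEvents = 0;

// The only code that has to live in IRAM: note the event and return.
void ICACHE_RAM_ATTR radioIsr() {
    radioEventPending = true;
}

// Called from loop(). The heavy work stays in flash.
// Returns true if an event was pending and has been handled.
bool serviceRadio() {
    if (!radioEventPending) {
        return false;
    }
    radioEventPending = false;
    ++handledEvents;  // placeholder for the real processing
    return true;
}
```

The trade-off is latency, and several interrupts that fire between two calls to `serviceRadio()` collapse into a single handled event, so this only fits events that can be coalesced.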

I still haven't checked base free IRAM, but if it really is ~3K as Adam reported, I think it's a significant issue that merits further examination. Because the ESP8266/Arduino is an embedded platform, anyone using it for non-trivial applications will likely have some ISR code. At the very least, I think the need for ICACHE_FLASH_ATTR and ICACHE_RAM_ATTR decorators should be front and center in the documentation, covered at least as well as the PROGMEM documentation since this is not an obvious thing for developers new to this platform and likely bites everyone as soon as they start writing non-trivial programs. I am not part of this project, but will happily contribute the documentation if permitted.

BTW, with respect to your comment in #797, IMO any developer who is writing interrupt handlers should be able to do so without the need for garbage collection and we'd benefit far more from better interrupt handling behavior and more available instruction RAM. Thanks again!

@Eszartek

Eszartek commented Mar 24, 2018

I hit this error right after 2.4.1 was committed (I normally sync to HEAD every few days), so I just rolled back to 2.4.0, expecting this just to be some minor commit typo that would be fixed soon. I use no extra ISRs (but I was planning to soon), and I try hard not to use iram for variables, so this change in iram usage may be a bigger issue for me than I first thought.

Does anyone have a list of advanced best practices to reduce iram usage other than the typical PROGMEM/PSTR()/F() wrapping?

My .text usage under 2.4.0 is 0x7480, not quite as tight as dalbert2's project, but clearly pushing the limits.

Backing out the vtables change from #4179 itself doesn't help, and all was fine leading up to 2.4.1 anyway, so I'm guessing something else is using more iram in 2.4.1 as well, but I can't pinpoint it yet.

It is commit: 170911a that triggers the blowout of iram for me. I'll look into why that is.

On the previous commit 0643d6e .text usage is at 0x74f8. So, 2.4.1 works for me, I got confused until I tested further.

@Eszartek

Eszartek commented Mar 24, 2018

So far, my first guess is that now that gdb_hooks.h is included explicitly in core_esp8266_main.cpp, it is pulling in extra code bits that use iram, even though it is supposed to be just a stub when not using gdb.

Update: I checked out HEAD, reverted core_esp8266_main.cpp to the version from the last working commit 0643d6e and my project builds fine. Nope, I messed up again, still working on it.

@devyte
Collaborator

devyte commented Mar 24, 2018

@dalbert2 Contributions are always welcome. If you propose a change, I'll take a look.
If you don't mind a bit of additional overhead in calling the code inside the ISR, one idea currently on the table is to schedule a function from the ISRs. The scheduled function then gets called once as though it were in the loop.
HOWEVER: currently there are two caveats with this:

  1. the function that does the scheduling is currently not declared for IRAM (see Schedule.h)
  2. the scheduled functions currently get called after the next loop. It is being discussed moving that to before the next loop => latency in reaching your code

The idea is that you move all your ISR code out of IRAM, and move that single scheduling function into IRAM.

There's a PR that sort of goes along these lines, but for the Ticker instead of hw ISRs: #4209 . It can serve to explain the idea.
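The scheduling idea can be sketched as a small ring buffer of function pointers. This is an illustration under stated assumptions, not the core's Schedule.h implementation: `scheduleFromIsr`, `runScheduled`, `kQueueSize`, `onButton` and `buttonPresses` are all hypothetical names, and the empty `ICACHE_RAM_ATTR` fallback is only there so the snippet compiles off-target.

```cpp
#include <cstddef>

#ifndef ICACHE_RAM_ATTR
#define ICACHE_RAM_ATTR  // host fallback; defined by the ESP8266 core on target
#endif

using Task = void (*)();

constexpr std::size_t kQueueSize = 8;  // hypothetical capacity (power of two)
Task taskQueue[kQueueSize];
volatile std::size_t queueHead = 0;    // advanced by the main loop
volatile std::size_t queueTail = 0;    // advanced by the ISR

// The single function that must live in IRAM: enqueue and return.
// Returns false when the queue is full and the task was dropped.
bool ICACHE_RAM_ATTR scheduleFromIsr(Task task) {
    const std::size_t next = (queueTail + 1) % kQueueSize;
    if (next == queueHead) {
        return false;
    }
    taskQueue[queueTail] = task;
    queueTail = next;
    return true;
}

// Call once per loop() iteration; runs queued tasks in normal context,
// so the tasks themselves can stay in flash. Returns how many ran.
std::size_t runScheduled() {
    std::size_t ran = 0;
    while (queueHead != queueTail) {
        Task task = taskQueue[queueHead];
        queueHead = (queueHead + 1) % kQueueSize;
        task();
        ++ran;
    }
    return ran;
}

// Hypothetical deferred handler for a front-panel button.
int buttonPresses = 0;
void onButton() { ++buttonPresses; }
```

An ISR would then contain little more than `scheduleFromIsr(onButton);`. One slot is kept empty to tell "full" from "empty", so the queue holds at most `kQueueSize - 1` pending tasks.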

@hreintke
Contributor

@devyte I have also been thinking/working on the scheduled_interrupt addition to functionalinterrupt (#2745)
Can finish/clean that and create a PR for that.

@devyte
Collaborator

devyte commented Mar 25, 2018

@hreintke That sounds interesting, and I've been thinking of a rewrite of the scheduled function code in C++.
BTW, #2738 is also yours. Is that superseded by #4209 ?
And about functionalInterrupt, there's a mem leak in the current implementation. Can you think of a way to fix it? When I looked at it, I didn't see an obvious solution.
But let's discuss that elsewhere.

@Adam5Wu
Contributor

Adam5Wu commented Mar 26, 2018

I hit the IRAM wall again when I am trying to make a moderate complexity project, with ArduinoJson, AsyncWebServer, AsyncMQTT, vFATFS.

The vtable size is 1750h and there is no way it could fit into IRAM in the near term, even with all the improvement efforts mentioned in the discussion above...

Is there an easy way to make vtable location configurable? I am not very familiar with the linker part of workflow, is it possible to have #ifdef in the eagle.app.v6.common.ld?

@d-a-v d-a-v self-assigned this Mar 26, 2018
@d-a-v
Collaborator

d-a-v commented Mar 26, 2018

@Adam5Wu I asked for the same a while ago, and it has been discussed (on gitter, @igrr suggesting using cpp just like you propose).
This is doable with an option in the boards generator.

@d-a-v
Collaborator

d-a-v commented Mar 26, 2018

@Adam5Wu please check #4567

@earlephilhower
Collaborator

earlephilhower commented Mar 26, 2018

Maybe it's worth considering removing the move to IRAM for vtables completely.

When I proposed it I was thinking of putting vtables completely into flash (which would avoid all these issues and not take any add'l flash space over the way before where they were in BSS). But that would make accessing a virtual class function impossible during interrupt processing. So it ended up in IRAM, which for my own use cases was pretty empty, but is obviously not so as the SDK updates and on larger, class-based projects.

However, that's one heck of a vtable size...at 6KB it'll eat up over 15% of total available RAM!

The alternative, if deferred functions are working, would be to put VTABLES in flash (change the linker entry position) and require folks to use deferred funcs for processing...

@Adam5Wu
Contributor

Adam5Wu commented Mar 26, 2018

@d-a-v thanks a lot, that certainly makes life easier switching between modes!

@earlephilhower I admit I am a bit of OO maniac... some of my classes use dual inheritance so the vtable tends to grow :D
... but this is exactly my excitement point about esp8266 arduino -- all the power of OO enables putting more sophisticated logic easily onto such a tiny package.

I think vtables in IRAM still have lots of value; with so little memory, every bit of saving counts!

@Eszartek

@d-a-v, @earlephilhower If the "move vtables" had a 3rd option "flash", then folks like me who mostly code in c instead of c++ could see an increase in both free iram and free heap?

@earlephilhower
Collaborator

earlephilhower commented Mar 26, 2018

@Eszartek yes, it would increase iram and heap as the constant vtable would only ever be present in one spot (flash). You're under the usual restriction of not using classes w/vtable in IRQs or while doing SPI work (i.e. anywhere your code needs to be in IRAM to work).

If this was extended, it's trivial to add an option to move __FILE__(or is it __function__ which is in RAM) macro definitions into flash only as well when using assert()s. The current postmortem.c will safely handle these when printing assertion failure notifications.

I didn't try adding that to the earlier PRs because if you use that macro in your own code it'll crash w/o pgm_read_byte() mediated accesses.

@d-a-v
Collaborator

d-a-v commented Mar 26, 2018

@Eszartek I will leave the Makefile flash option to the professionals. I was only the Makefile initiator here, because I sometimes do the operation by hand too and it costs the same price.
I know for sure it is far easier to modify already existing code than to do the full job. This is my ldscripts contribution (they are still quite obscure to me; until today I did not realize order mattered in them).

@Eszartek

@d-a-v I find it interesting to read all this discussion, that each vtable storage scenario (heap, iram, flash) has value depending on the project design. Up until now, I have only been concerned with minimizing heap usage and stuffing as much into flash as possible. I never imagined there were more gotchas lurking around the corner. Cool info!

@dalbert2
Contributor Author

@earlephilhower I appreciate the insights and ideas. I'd like to respectfully suggest that until a flexible option for placement of vtables is available, the development team consider rolling back the move of vtables into IRAM or move them to flash. Moving vtables to IRAM has broken many medium-sized applications and may break the entire system depending on how the core develops.

I believe it would be healthier for the project to maintain backward compatibility. The restriction of not using virtual functions for interrupt service routines is far less damaging than not being able to write non-trivial interrupt service routines at all (these are embedded systems after all).

Thanks again to all the devs for all of their hard work on this, and I look forward to your decision!

@devyte
Collaborator

devyte commented Mar 27, 2018

6K in vtables... Wow. @Adam5Wu have you considered a template based design instead of a polymorphic one?
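As an illustration of what a template-based design could look like, here is a minimal CRTP sketch next to its polymorphic equivalent. The class names (`SensorBase`, `Sensor`, `TempSensor`) and the value returned are invented for the example and have nothing to do with the code being discussed.

```cpp
#include <type_traits>

// Polymorphic version: the virtual functions give the class a vtable.
struct SensorBase {
    virtual int read() const = 0;
    virtual ~SensorBase() = default;
};

// CRTP version: the base forwards to the derived class at compile time,
// so no virtual dispatch and no vtable is involved.
template <typename Derived>
struct Sensor {
    int read() const {
        return static_cast<const Derived*>(this)->readImpl();
    }
};

// Hypothetical concrete sensor.
struct TempSensor : Sensor<TempSensor> {
    int readImpl() const { return 21; }
};

// Generic code takes the base by template parameter instead of by pointer.
template <typename S>
int sample(const Sensor<S>& sensor) {
    return sensor.read();
}

static_assert(std::is_polymorphic<SensorBase>::value, "has a vtable");
static_assert(!std::is_polymorphic<TempSensor>::value, "no vtable");
```

The cost is that different sensor types no longer share a common runtime base, so they cannot sit in one container, and each instantiation of `sample` adds its own code to flash.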

@earlephilhower
Collaborator

@dalbert2, all:

@d-a-v has just pushed a makefile-based way to select iram or heap (the traditional way) for the tables. I'm going to add a mode to put it all in flash as well, and in fact I think that's the best spot for them. IRQs and vtables are a bad mix IMO, and relatively uncommon, so flash as default a) saves heap and iram, and b) will work for the majority of folks. People doing serious work where they need large vtables and access in IRQs can use either heap or iram via the makefile.

I think that'd cover things. Backwards compatible for the vast majority of folks, even with medium to large projects (and giving them extra RAM to work with), but customizable for advanced users. OOM errors are probably the biggest issue on ESP8266, so I'd like to help solve them by default...

@dalbert2
Contributor Author

You guys are awesome...thank you so much!

@Eszartek

Eszartek commented Mar 27, 2018

@d-a-v @earlephilhower The new Makefile to choose heap/iram works perfectly for me. I can now compile my project from the tip of master.

Since now I'm all curious about iram usage, I wanted to see how much usage has changed since 2.4.1:

Compiled with master with vtables in heap

Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         000015dc  3ffe8000  3ffe8000  000000e0  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  1 .rodata       00001ae8  3ffe95e0  3ffe95e0  000016c0  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          0000c428  3ffeb0c8  3ffeb0c8  000031a8  2**4
                  ALLOC
  3 .irom0.text   00051d00  40201010  40201010  0000aff0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  4 .text         00007e48  40100000  40100000  000031a8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

Compiled with 2.4.1

Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         000015d4  3ffe8000  3ffe8000  000000e0  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  1 .irom0.text   000515d0  40201010  40201010  0000a210  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .text         000074f0  40100000  40100000  00002d20  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  3 .rodata       0000165c  3ffe95e0  3ffe95e0  000016c0  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  4 .bss          0000c2b8  3ffeac40  3ffeac40  00002d20  2**4
                  ALLOC

So that is (7e48-74f0) = 958h or 2392 bytes more iram usage now vs 2.4.1. If this is the correct way to look at the usage, does the handful of commits since 2.4.1 really need to use this much more iram? Or, is this not a concern for me and I should stop dwelling on it?
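The comparison above can be automated with a small host-side helper that pulls a section size out of one row of `objdump -h` output. This is a rough sketch that assumes the row layout shown in the listings above; the function name `sectionSize` is hypothetical.

```cpp
#include <sstream>
#include <string>

// Returns the size in bytes of section `name` from one `objdump -h` row,
// e.g. "  4 .text         00007e48  40100000 ...".
// Returns -1 if the row does not describe that section.
long sectionSize(const std::string& row, const std::string& name) {
    std::istringstream in(row);
    int index = 0;
    std::string section;
    std::string sizeHex;
    if (!(in >> index >> section >> sizeHex) || section != name) {
        return -1;
    }
    return std::stol(sizeHex, nullptr, 16);  // size column is hexadecimal
}
```

Fed the two `.text` rows above, it yields 0x7e48 and 0x74f0, whose difference is the 2392 bytes quoted.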

Update: Here are the results of vtables in flash:

Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         000015dc  3ffe8000  3ffe8000  000000e0  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  1 .irom0.text   00052120  40201010  40201010  0000abd0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .text         00007e48  40100000  40100000  00002d7c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  3 .rodata       000016bc  3ffe95e0  3ffe95e0  000016c0  2**4
                  CONTENTS, ALLOC, LOAD, DATA
  4 .bss          0000c430  3ffeaca0  3ffeaca0  00002d80  2**4
                  ALLOC

.text section is the same as vtables in heap, but flash usage (.irom0.text) bumped up appropriately. Thank you @d-a-v & @earlephilhower for your work on this!

@dalbert2
Contributor Author

Am I reading this correctly, even with the Makefile changes, there is virtually no IRAM available (only 440 bytes)? Having no ability to write ISRs seems like a critical problem for an embedded platform; what are the changes that consumed so much IRAM and are they really worthwhile?

@earlephilhower
Collaborator

Check the .MAP output on git HEAD and look at the .text section. On the Blink example, I see about 0x6000 bytes taken by the SDK libs.

LWIP2 looks like ~0x0500 bytes.

On Arduino code, the sizes look like below:

 .iram.text     0x0000000040106750       0xe7 /tmp/arduino_build_696987/arduino.ar(core_esp8266_wiring.c.o)
 .iram.text     0x0000000040106838      0x2a9 /tmp/arduino_build_696987/arduino.ar(core_esp8266_wiring_digital.c.o)
 .iram.text     0x0000000040106ae4      0x133 /tmp/arduino_build_696987/arduino.ar(core_esp8266_wiring_pwm.c.o)
 .iram.text     0x0000000040106c18       0x87 /tmp/arduino_build_696987/arduino.ar(cont_util.c.o)
 .iram.text     0x0000000040106ca0      0x14b /tmp/arduino_build_696987/arduino.ar(core_esp8266_timer.c.o)
 .iram.text     0x0000000040106dec       0x5d /tmp/arduino_build_696987/arduino.ar(heap.c.o)
 .iram.text     0x0000000040106e4c       0x4a /tmp/arduino_build_696987/arduino.ar(core_esp8266_phy.c.o)
 .iram.text     0x0000000040106e98       0xc3 /tmp/arduino_build_696987/arduino.ar(libc_replacements.c.o)
 .iram.text     0x0000000040106f5c       0x17 /tmp/arduino_build_696987/arduino.ar(sntp-lwip2.c.o)
 .iram.text     0x0000000040106f73      0x10d /tmp/arduino_build_696987/arduino.ar(core_esp8266_postmortem.c.o)
 .iram.text     0x0000000040107080        0x4 /tmp/arduino_build_696987/arduino.ar(gdb_hooks.c.o)
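To put a number on the Arduino core's share, the sizes in the listing above can simply be added up. The sketch below hard-codes those eleven entries; the struct and function names are hypothetical, and alignment padding between objects is not counted.

```cpp
#include <cstddef>
#include <cstdint>

struct IramEntry {
    const char* object;
    std::uint32_t size;
};

// .iram.text sizes copied from the map-file excerpt above.
const IramEntry kCoreIram[] = {
    {"core_esp8266_wiring.c.o", 0xe7},
    {"core_esp8266_wiring_digital.c.o", 0x2a9},
    {"core_esp8266_wiring_pwm.c.o", 0x133},
    {"cont_util.c.o", 0x87},
    {"core_esp8266_timer.c.o", 0x14b},
    {"heap.c.o", 0x5d},
    {"core_esp8266_phy.c.o", 0x4a},
    {"libc_replacements.c.o", 0xc3},
    {"sntp-lwip2.c.o", 0x17},
    {"core_esp8266_postmortem.c.o", 0x10d},
    {"gdb_hooks.c.o", 0x4},
};

// Sum of all listed .iram.text contributions, in bytes.
std::uint32_t totalCoreIram() {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < sizeof(kCoreIram) / sizeof(kCoreIram[0]); ++i) {
        total += kCoreIram[i].size;
    }
    return total;
}
```

The total comes to 0x927 (2343) bytes, small next to the roughly 0x6000 bytes attributed to the SDK libraries above.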

I think the libc replacements should probably be able to be moved out (if you're calling _fopen_r in an interrupt, you are so far from correct it's not funny....)

Some routines in core_wiring_digital might be able to be moved, (do you need to be able to attach or detach an interrupt, during an interrupt?) but a lot of the code needs to be there since you use IRQs for doing digitalWrites() to do PWM or stepper control or whatever. Same for cont_util (after the single init call, the cont_init fcn should never be used again....and even so, does it need to be in ram?)

If there was a way to guarantee the cache is in a usable state, about 2/3 of the postmortem could be punted to flash, I think.

@steminabox

Good suggestion with libc - I took them all out and got an extra 192 bytes...
In core_wiring_digital, do you reckon __attachInterruptArg, __attachInterrupt and __detachInterrupt?

The problem with running around doing these things is that unless you know everywhere in the library that they are used, you might be causing problems that will bite you in the backside weeks later...

But any other suggestions?

And it's interesting to note that the users over at Espressif are complaining because the SDK itself has become bloatware...

As I said on the other post, having over 93% of a critical resource used before the user starts to write a program is just silly. This IRAM issue looks to be the Achilles heel of the 8266: as soon as you try to write a non-trivial program on the 8266 you hit that 32K limit...

Every bit that can come out of the Arduino library needs to come out, even if that slightly limits some things (like not being able to call _fopen from an interrupt!).

@steminabox

Should I start a new post with all the things I find to delete? I'm just trying them and seeing what happens :-) (almost random monkey approach)... i.e.

Why would I need - Tone.c? Gone... hexdump? (not interested) gone.

In postmortem - also something I'm not interested in - how is that ending up in IRAM? In fact, I may as well pull all the gdb stuff - that's an example of what I mean by bloat, as you don't want any of it in your final code, and most people wouldn't use it in developing either. So it should all be #defined out unless someone wants to use it. (Of course, it may be and I just haven't seen it yet! i.e. until today I've never bothered with what's inside the library :-) ).

I'm looking for anything with ICACHE before it that can be got rid of... :-)

@devyte
Collaborator

devyte commented Apr 1, 2018

@steminabox Your approach is flawed. If you remove something like tone.c, and your app still builds, then you have gained nothing, because the translation unit (TU) wasn't being included in your build anyway. You seem to be forgetting about some build optimizations.
What you should do is measure the IRAM use in a sketch that is similar to your final app, then start removing things one by one, and see the effect on IRAM use in the built binary. If you didn't gain anything, don't remove it.
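
A minimal sketch of that measure-then-remove loop, assuming you already have per-object IRAM sizes for a baseline build and a trial build (for example, totalled from the linker map). The `iram_delta` helper is hypothetical. The two baseline sizes are taken from the map listing earlier in this thread, and the trial result is made up for illustration.

```python
def iram_delta(before, after):
    """Return (object, before, after, delta) rows for objects whose IRAM size changed."""
    rows = []
    for obj in sorted(set(before) | set(after)):
        b, a = before.get(obj, 0), after.get(obj, 0)
        if a != b:
            rows.append((obj, b, a, a - b))
    # Largest savings (most negative delta) first.
    rows.sort(key=lambda r: r[3])
    return rows

# Baseline: sizes in bytes, from the map listing earlier in the thread.
baseline = {
    "core_esp8266_wiring_digital.c.o": 681,
    "libc_replacements.c.o": 195,
}

# Trial A (hypothetical): a file that was never linked in was deleted,
# so the per-object IRAM picture is identical and nothing was gained.
trial_a = dict(baseline)

# Trial B (hypothetical): the libc replacements no longer occupy IRAM.
trial_b = {"core_esp8266_wiring_digital.c.o": 681}

print(iram_delta(baseline, trial_a))
print(iram_delta(baseline, trial_b))
```

An empty result for a trial means the removal did not change IRAM use, so there is no reason to keep that change.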

About postmortem, that's a safety net. If your app crashes, or runs oom, then you want to know why and where. If you remove that, then your app will crash and not give you any useful insight as to why. You have the following choices:

  1. You can develop painfully without it. Good luck if you do this, because if you encounter a core issue, you'll be on your own, given that you won't be able to open a policy-compliant issue in this tracker, and will therefore get no help.
  2. You can develop without it, and if you encounter an issue, you can try to put it back in to get insight into the crash. This doesn't make much sense, because if you need postmortem out for your app to work, then you'll have trouble putting it back in, and if you do get it back in and your app works, why take it out at all?
  3. Don't take it out, leave it in for the same reason all other developers do.

About the gdb stuff, I'm not sure if there is any overhead if it is not actively inited, i.e., I suspect the TU should get left out, same as in other cases, but I'm not sure.

Instead of shooting in the dark, how about you implement an idea that just came up? Put all ISRs in their own files, to make sure they don't get built into apps that don't use them. If you're interested, look me up in our gitter channel, and I can guide you on how to proceed.

@steminabox

Yep, I got that on tone.c - I didn't know how much optimization there was downstream (till I compiled and saw that it didn't make a difference). But removing it wasn't going to be a problem either way.
On postmortem - I don't think I've ever asked for help with a crash, or looked at any dump. If I get it crashing - fairly rare - it normally isn't that hard to work out why.
(and I did say I was doing almost random monkey :-) )
I really don't want to touch the core code, as that makes updates a pain. So I'm doing this hack now to make what I'm currently doing fit; then either the official library is going to be unbloated (in terms of IRAM) - which may also be problematic given that the Espressif SDK is also bloating - or I'll move on to the esp32 (I would have done that by now apart from the fact that it seems not to be 5v tolerant, where the 8266 is 5v tolerant (not supply voltage, logic)).
While the used IRAM on the Arduino esp32 is a bit more than on the Arduino 8266, it doesn't matter anywhere near as much, as it HAS a lot more...

Back to your idea - I see what you mean with putting the ISRs into separate files, as I assume (like tone) they will be optimized out of the final build by the linker if they haven't been referenced. Why doesn't the official build do this, i.e. have a subdirectory 'isr' and put them all in there?

@Eszartek

Eszartek commented Apr 1, 2018

Although all is well for me at the moment with vtables in heap or flash as per @earlephilhower's PR, commit 170911a still has me worried, since it was the commit that triggered the IRAM exhaustion for my app and it really appears to be including gdb bits that were never included before. Prior to this commit you had to explicitly #include <GDBStub.h>; now it's included in postmortem.

On the other hand, it could be that it tipped IRAM usage by a few bytes and I was right on the line, so if it wasn't that commit, it would have been some other that would have become the trigger.

Update: 108 byte increase. Nothing crazy, but enough to exceed my allowance of iram.

I admit though, in this repo, the new commits that excite me the most are the ones that say "saved xxx bytes heap/ram by doing xyz". I drop whatever I'm doing at that moment to get my handout of free bytes :)

@dalbert2
Contributor Author

dalbert2 commented Apr 4, 2018

Could someone post a for-dummies guide on how to build using the version at Git head? The instructions here seem to be out of date. Thanks!

@Eszartek

Eszartek commented Apr 4, 2018

It works for me. I made a shell script for my convenience:

#!/bin/bash

cd hardware/esp8266com
rm -rf esp8266
#git clone -b "2.4.1"  --single-branch https://github.com/esp8266/Arduino.git esp8266
git clone https://github.com/esp8266/Arduino.git esp8266
(cd esp8266/tools; ./get.py)

@d-a-v
Collaborator

d-a-v commented Apr 4, 2018

@dalbert2 These instructions are up to date and working.
Please don't spoil an active issue with an unrelated comment;
prefer opening a new one for your particular problem, with useful input that would help us help you.

@dalbert2
Contributor Author

dalbert2 commented Apr 4, 2018

Datapoints: an empty sketch built with
2.4.0: .data=0x4fc, .bss=0x7260, .text=0x6fcf (4145 bytes free)
2.4.1: .data=0x4fc, .bss=0x7290, .text=0x6fe0 (4128 bytes free)
ESP8266_wps_example sketch:
2.4.1: .data=0x51c, .bss=0x73a8, .text=0x70c4

So at first glance, it looks like core/SDK IRAM usage didn't change much between versions. However, even after I reworked my application to reduce IRAM usage, when I build with:
2.4.0: .data=0x99c, .bss=0x8008, .text=0x7c63
2.4.1: iram1_0_seg overflows

So it appears that something causes my application to use at least 925 bytes more IRAM with 2.4.1 than with 2.4.0.
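
The free-byte figures above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the IRAM segment is 32 KiB (0x8000 bytes), which matches the "bytes free" numbers quoted, and it treats the reported .text size as the IRAM usage, as the datapoints do. The dictionary labels are made up for readability.

```python
IRAM_SIZE = 0x8000  # assumed 32 KiB iram1_0_seg, consistent with the figures above

# .text sizes quoted in the datapoints
text_sizes = {
    "empty sketch, 2.4.0": 0x6FCF,
    "empty sketch, 2.4.1": 0x6FE0,
    "ESP8266_wps_example, 2.4.1": 0x70C4,
    "application, 2.4.0": 0x7C63,
}

# Bytes left in IRAM for each build
free = {name: IRAM_SIZE - size for name, size in text_sizes.items()}

for name, size in text_sizes.items():
    print(f"{name:28s} .text={size:5d}  free={free[name]:4d}")

# Growth of the empty sketch between core versions
print("empty-sketch growth:", text_sizes["empty sketch, 2.4.1"] - text_sizes["empty sketch, 2.4.0"])
```

The application built with 2.4.0 has 925 bytes free, so an overflow with 2.4.1 implies more than 925 extra bytes, while the empty sketch grew by only 17.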

@dalbert2
Contributor Author

dalbert2 commented Apr 10, 2018

@d-a-v I will open another ticket and will provide relevant details (including the partial fix which came from #4464).

I posted it here because until a fix for this issue is folded into a release, anyone struggling with the out-of-IRAM problem may want to pull from git head.

The partial fix for building from git is to install Python 3.x (not 2.7.x as per the instructions), or it will fail during the get.py step (at least on Windows and Mac platforms). The step that is needed for dummies (i.e. me) and that's not included in the instructions is how to get the downloaded 2.5.0-dev package to show up in the boards manager. I'll update this post when I figure that out.

@d-a-v
Collaborator

d-a-v commented Sep 4, 2018

The vtable location selector is in the menu in core-2.4.2.
About heap, we have #3740.
Closing.

@d-a-v closed this as completed Sep 4, 2018
@dalbert2
Contributor Author

dalbert2 commented Sep 5, 2018

Confirmed, 2.4.2 fixes the issue.
