Xtensa ELF info/hints? #3
Comments
No, AFAIK. The best description I know of is in the binutils source: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/bfd-in2.h;h=ade49ffc6188210ad2d6484c154853eb6c75613e;hb=HEAD#l5359 and https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf32-xtensa.c;h=25236707dae46e7190c646de1601fb1f6ff088fc;hb=HEAD#l165
No, AFAIK. Can you give an example of such a library? I'm curious what the linking command looks like for it.
I'm not surprised at all, but that reference doesn't explain much. esp-elf-rom is made to ease debugging with gdb. But from what you're saying it looks like you're developing a dynamic loader, right?
-ENOPARSE. Can't find anything related at your link.
Thanks. So, is the purpose of R_XTENSA_ASM_EXPAND (https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf32-xtensa.c;h=25236707dae46e7190c646de1601fb1f6ff088fc;hb=HEAD#l1965), for example, only to mark a place for the linker to check, not to really change the instruction's args? Also, does R_XTENSA_NONE mean "there was a relocation needed, but now it's done somehow" or "void entry, don't assume there was a relocation needed at all"? (See below for argumentation.)
This gives an example: http://stackoverflow.com/a/6570000/496009 . Again, only a few archs support relocatable (vs PIC) shlibs, like i386.
Well, so I'm looking for a way to automatically tell which instruction operands are addresses and which are not. One way to do that is by using relocs. At the same time, I need the code to be linked already (all xrefs resolved, and all addresses are in the code). That's done by applying relocations, and they're no longer needed after that and are discarded. So, I was looking for a way to get both ;-). ld -r doesn't work as it explicitly produces an object, not an executable file, and then the 2nd idea was to cheat by producing a shlib instead of an executable. That doesn't appear to work, so it looks like I'll need to write a kind of linker ;-).
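To illustrate the "use relocs to find address operands" idea, here is a rough Python sketch that decodes the raw bytes of an ELF32 little-endian .rela section and lists the offsets that relocations point at. The relocation type numbers are as I recall them from binutils' include/elf/xtensa.h, so treat them as an assumption and check them; parse_rela and address_sites are made-up helper names, and which types count as "address" is a guess for illustration only.

```python
import struct

# Relocation type numbers as recalled from binutils' include/elf/xtensa.h
# (an assumption; verify against your binutils version).
R_XTENSA_NONE = 0
R_XTENSA_32 = 1
R_XTENSA_ASM_EXPAND = 11
R_XTENSA_SLOT0_OP = 20

# Elf32_Rela layout: r_offset, r_info, r_addend (little-endian).
RELA_ENTRY = struct.Struct("<IIi")


def parse_rela(section_bytes):
    """Decode the raw contents of an ELF32 little-endian .rela section."""
    relocs = []
    for r_offset, r_info, r_addend in RELA_ENTRY.iter_unpack(section_bytes):
        relocs.append({
            "offset": r_offset,
            "sym": r_info >> 8,     # ELF32_R_SYM
            "type": r_info & 0xFF,  # ELF32_R_TYPE
            "addend": r_addend,
        })
    return relocs


def address_sites(relocs):
    """Offsets where a relocation suggests an address is stored or encoded.

    Treating only R_XTENSA_32 and R_XTENSA_SLOT0_OP as address-bearing is
    an illustrative choice, not a complete rule.
    """
    address_types = {R_XTENSA_32, R_XTENSA_SLOT0_OP}
    return sorted(r["offset"] for r in relocs if r["type"] in address_types)
```

Getting the section bytes out of the file (section header lookup) is left out; any ELF reader can supply them.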
It was this: tommie/lx106-hal#1 (comment)
Yes, it marks the places for link-time relaxation.
I think R_XTENSA_NONE should never appear in objects/executables. If it does it's most likely a bug.
Not sure I understand. The instruction defines how its operand is used, e.g. in "l32r a0, x", x is always an address. You probably care if the value loaded from x is an address, right? If so then I don't see why having a PIC shared object is bad: addresses will anyway be represented as literals with relocations against them, and when you disassemble an instruction you'd be able to see that it refers to such a literal. If for some other reason GOT and PLT need to be avoided it still may be easier to relax ld restrictions on relocation placement and allow leaving R_XTENSA_SLOT*_OP type relocations in the linked shared object. One of the reasons it's not allowed now is that these relocation types don't describe the relocation completely: the instruction the relocation points at must be analyzed to understand how its immediate subfield must be changed. That'd be very expensive for a dynamic linker, but doesn't matter for static analysis.
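As a small illustration of "the instruction where relocation points must be analyzed", here is a Python sketch that decodes a little-endian L32R and computes the literal address it refers to. It follows the ISA description (16-bit immediate, one-extended, shifted left by two, added to the L32R address plus three with the low two bits cleared). It assumes the plain PC-relative form and ignores the Extended L32R (LITBASE) option; decode_l32r is a made-up name.

```python
def decode_l32r(insn_bytes, pc):
    """Decode a little-endian L32R at address pc.

    Returns (register_number, literal_address), or None if the bytes are
    not an L32R. Assumes the PC-relative form (no LITBASE).
    """
    if len(insn_bytes) < 3:
        return None
    word = int.from_bytes(insn_bytes[:3], "little")
    if word & 0xF != 0x1:  # op0 == 1 selects L32R
        return None
    reg = (word >> 4) & 0xF          # destination register field (t)
    imm16 = (word >> 8) & 0xFFFF     # literal offset in words
    offset = 0xFFFC0000 | (imm16 << 2)  # one-extend: literal is always below
    base = (pc + 3) & ~3
    return reg, (base + offset) & 0xFFFFFFFF
```

With the literal address in hand, a static tool can check whether a relocation covers that literal and so decide if the loaded value is an address.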
Well, yeah, the beauty of RISC. But that's not true in the general case, e.g. if something is linked at address 0, N in "movi aX, N" can be either a literal numeric value or an address. For archs where "move immediate" is full-range, or for RISCs which emulate it with something l32r-like, the issue is also apparent.
In an object file produced by "ld -r"ing together all objects from exploded esp8266 sdk libs:
And generally, if those mark a place which was already fixed up (e.g. SLOT0_OP which was undefined in a single object, but which was fixed up with relative addressing), it's better to have (for my usecase) at least NONE, than nothing at all.
It's not bad. The question was whether non-PIC objects can be put into a shared lib: I just took an esp8266 which produces ELF (from which the actual ROM image is to be extracted), and added the --shared option, leading to a bunch of errors quoted above, so I just wondered if something could be done about that, but I assume not. From the Linux point of view, requiring a shlib to always be PIC makes good sense, given that it simplifies the dynamic linker and gives a 100% sharable image w/o need for pages dirtied by relocations. Well, thanks for the discussion, it was helpful. As I mentioned, I started writing a kind of load-linker for scratchabit, even if it will be just a proof of concept.
Interesting. I looked at the produced object file and saw that
I still think that these are bugs. BTW, have you tried linker options
Great, exactly what I need! I tried to look thru ld --help, but apparently quit that too early, switching to google instead. Thanks for the hint!
Another question, not directly related to the above, but to not create another ticket: Reading Xtensa ISA RefMan, s.8.3.1:
Suppose I want to perform the reverse transform - turn L32R into MOVI, but want to make it distinguishable from a real MOVI - what naming would you suggest? So far I use "movi*", but maybe some form would be more "Xtensa-ic", e.g. "movi.l"?
Make it distinguishable in what context? You mean disassembling l32r into movi? Don't know. To my taste literal disassembly with loaded value in comment is the best.
No, AFAIK: we only make opcode substitution at assembly time, not at disassembly. And if you write in assembly you usually just use movi regardless of the immediate value.
Yes, in the context of producing human-readable disassembly (which is the context of ScratchABit mentioned above). You prefer that because you use Xtensa asm daily; for other people it's a nuisance to remember the difference between l32i & l32r ;-). Also, comments are just that - a sequence of chars, while arguments are objects and have a type (numeric value/address at least). So, in a current prototype of this feature for ida-xtensa I have argument vs comment the other way around:
So, if you don't have better suggestions than "movi*", let it stay that way ;-).
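To make the "arguments are objects and have a type" point concrete, here is a hypothetical Python sketch of rendering an L32R as a marked pseudo-MOVI, with the loaded value as a typed argument and the literal form in the comment. The actual ida-xtensa prototype output isn't shown in this thread, so the Operand class, the "loc_" label style and the comment layout are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Operand:
    value: int
    is_address: bool  # typed, so a tool can follow or rename the target


def render_l32r_as_movi(reg, literal_addr, loaded_value, value_is_address):
    """Render an L32R as "movi*" with the raw l32r form in a comment.

    Returns the text line plus the typed operand, so the caller keeps
    more than just a string.
    """
    arg = Operand(loaded_value, value_is_address)
    shown = f"loc_{arg.value:08x}" if arg.is_address else f"0x{arg.value:x}"
    text = f"movi* a{reg}, {shown}  ; l32r a{reg}, 0x{literal_addr:08x}"
    return text, arg
```

The point of returning the Operand alongside the text is that a comment can only be read, while a typed argument can be cross-referenced.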
Another support request regarding Xtensa stuff:
Thanks.
Context: well, if you make things like https://github.com/jcmvbkbc/esp-elf-rom yourself, you shouldn't be surprised someone else asks such questions ;-). And I did a "@jcmvbkbc" in another project's ticket, so just leaving it here: https://github.com/pfalcon/ScratchABit