[misc] various notes
Gary Benson committed Nov 16, 2015
1 parent 1d44942 commit a6f7459
Showing 6 changed files with 242 additions and 0 deletions.
44 changes: 44 additions & 0 deletions FUNCTIONS
@@ -0,0 +1,44 @@
td_ta_map_lwp2thr:
- always needs to call ps_getpid (currently passed as an argument)
- needs to access inferior memory (done)
- generally needs to call ps_get_thread_area (SYSTEM FUNC CALLS)
- returns multiple values

td_ta_thr_iter:
- walks linked lists in inferior
- calls a user-supplied function: (SYSTEM FUNC CALLS VIA ARG)
Args:
- (supposedly) opaque (to caller) thread handle
- opaque to libthread_db void *callback_data (just passed in)
Returns:
- just some int to indicate an error
- also calls ps_getpid (could easily be passed as an argument)

td_thr_get_info:
- writes into a structure passed as an argument
(probably no portable way to do this -- sizes, offsets different)
maybe just return a lot of values
- also calls ps_getpid (less obvious as an argument but why not?)

td_thr_tls_get_addr:
- accesses link map (which is not in the same solib... need an
INFINITY_EXPORT there)
- basically a wrapper for td_thr_tlsbase (I8FUNC-I8FUNC FUNC CALLS)

td_thr_tlsbase:
- Calls __td_ta_lookup_th_unique (part of map_lwp2thr!)
- dtv_slotinfo???

Basically need to have functions as first class objects.
All the function calls could be rolled into DW_OP_GNU_i8call

Need some (convention? enforcement?) to say "these functions
are internal"

I think it has to be convention, there's no way to group functions
together other than by provider (no cross-provider calling of internal
functions? but we're allowing cross-provider symbol reference so...)

Thread handles from td_ta_map_lwp2thr/td_ta_thr_iter and to
td_thr_get_info will likely be just the th_unique member of
td_thrhandle_t.
2 changes: 2 additions & 0 deletions FUTURE
@@ -0,0 +1,2 @@
If x86_64 map_lwp2thr can be rewritten using DW_OP_bregx %fs.base, 0
then that could be a significant performance boost.
37 changes: 37 additions & 0 deletions HELLO
@@ -0,0 +1,37 @@
thread_db returns td_thrhandle_t objects as handles
these are NOT OPAQUE
although the declaration says it is:

/* The actual thread handle type. This is also opaque. */
typedef struct td_thrhandle
{
td_thragent_t *th_ta_p;
psaddr_t th_unique;
} td_thrhandle_t;

go figure

GDB doesn't touch either, except one inconsequential time:

molly:[src]$ grep -r 'th_\(ta_p\|unique\)' gdb
gdb/linux-thread-db.c: th.th_unique = 0;
gdb/nat/glibc_thread_db.h: td_thragent_t *th_ta_p;
gdb/nat/glibc_thread_db.h: psaddr_t th_unique;

So we can return just "psaddr_t th_unique" and use that as our "handle"
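
A minimal sketch of the "just use th_unique" idea, in C. The typedefs are local stand-ins that mirror the declaration quoted above (psaddr_t is assumed to be void *), and both helper names are hypothetical, not anything from GDB or glibc.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins mirroring the thread_db declarations quoted above.  */
typedef void *psaddr_t;
typedef struct td_thragent td_thragent_t;

typedef struct td_thrhandle
{
  td_thragent_t *th_ta_p;
  psaddr_t th_unique;
} td_thrhandle_t;

/* Hypothetical: the only thing handed around as a "handle" is th_unique.  */
static psaddr_t
i8_unique_from_thrhandle (const td_thrhandle_t *th)
{
  assert (th != NULL);
  return th->th_unique;
}

/* Hypothetical: rebuild a td_thrhandle_t for callers that still want one.  */
static td_thrhandle_t
i8_thrhandle_from_unique (td_thragent_t *ta, psaddr_t unique)
{
  td_thrhandle_t th;

  th.th_ta_p = ta;
  th.th_unique = unique;
  return th;
}
```

The second helper is only there for the case where wrappers keep the libthread_db signatures and have to hand a full td_thrhandle_t back.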

---

For td_thr_get_info GDB passes in a pointer to a td_thrinfo_t structure.
(which has lots of elements, grrr)
return a ton of values and let the caller fill in the struct?
this is where referencing functions by typed names will be handy...
for cross-core debugging the struct size&offsets might not be the same, so...

infinity-thread-db.c
collects new func notifications (per inferior? per whatever makes sense
for thread-db?)
and provides 4 i8_thread_db_* functions which wrap all this
(into td_thrhandle_t stuff too?
with same signatures as libthread_db funcs)

21 changes: 21 additions & 0 deletions HOOKIN
@@ -0,0 +1,21 @@
infinity-thread-db.c will enable thread-debugging per-pspace,
linux-thread-db.c tracks libthread_db instances by PID:

static struct thread_db_info *get_thread_db_info (int pid)

What I did in the original gbenson/infinity branch was right.

* Have infinity-thread-db.c watch functions arriving and collect them
in some per-pspace struct (which has some is_complete field which
the i8func observers maintain).

* Whenever linux-thread-db.c sees a new objfile (hopefully AFTER
infinity(-notes).c saw it!) its thread_db_new_objfile observer
calls check_for_thread_db which calls thread_db_load.

* thread_db_load can grab a (fake, per-pid/per-pspace) "dlopen" handle
from infinity-thread-db.c that it can pass to thread_db_load_* as
before. Also infinity-thread-db.c should have some hacky dlsym
and dlclose wrappers. (maybe dlopen too, for completeness).

Boom, done.
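
A rough sketch of what the fake "dlopen" handle could look like, in C. Every name here (i8_fake_handle, i8_fake_dlopen, i8_fake_dlsym, i8_fake_dlclose) is hypothetical, and the real thing would be keyed per-pspace and return void * like dlsym does; this version returns a generic function pointer to stay within ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer; callers cast back to the real signature.  */
typedef void (*i8_func_ptr) (void);

struct i8_fake_sym
{
  const char *name;      /* e.g. "td_ta_map_lwp2thr" */
  i8_func_ptr func;
};

/* Hypothetical per-pspace state that doubles as the fake handle.  */
struct i8_fake_handle
{
  int is_complete;       /* Maintained by the function observers.  */
  const struct i8_fake_sym *syms;
  size_t num_syms;
};

/* "dlopen": only hand out the handle once every function has arrived.  */
static struct i8_fake_handle *
i8_fake_dlopen (struct i8_fake_handle *h)
{
  assert (h != NULL);
  return h->is_complete ? h : NULL;
}

/* "dlsym": linear lookup by name, NULL if the symbol is unknown.  */
static i8_func_ptr
i8_fake_dlsym (struct i8_fake_handle *h, const char *name)
{
  assert (h != NULL && name != NULL);
  for (size_t i = 0; i < h->num_syms; i++)
    if (strcmp (h->syms[i].name, name) == 0)
      return h->syms[i].func;
  return NULL;
}

/* "dlclose": nothing to release in this sketch.  */
static int
i8_fake_dlclose (struct i8_fake_handle *h)
{
  (void) h;
  return 0;
}
```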
120 changes: 120 additions & 0 deletions NOTES
@@ -0,0 +1,120 @@
accessor for symbol slots (requires address size baked into func?)

dwarf2expr.c
============

from execute_stack_op:
/* Old-style "untyped" DWARF values need special treatment in a
couple of places, specifically DW_OP_mod and DW_OP_shr. We need
a special type for these values so we can distinguish them from
values that have an explicit type, because explicitly-typed
values do not need special treatment. This special type must be
different (in the `==' sense) from any base type coming from the
CU. */
-> DWARF is typed?

case DW_OP_bregx:
op_ptr = safe_read_uleb128 (op_ptr, op_end, &reg);
op_ptr = safe_read_sleb128 (op_ptr, op_end, &offset);
result = (ctx->funcs->read_addr_from_reg) (ctx->baton, reg);
result += offset;
result_val = value_from_ulongest (address_type, result);
...
dwarf_expr_push (ctx, result_val, in_stack_memory);

dwarf2loc.c
===========

static CORE_ADDR
dwarf_expr_read_addr_from_reg (void *baton, int dwarf_regnum)
{
struct dwarf_expr_baton *debaton = (struct dwarf_expr_baton *) baton;
struct gdbarch *gdbarch = get_frame_arch (debaton->frame);
int regnum = gdbarch_dwarf2_reg_to_regnum (gdbarch, dwarf_regnum);

return address_from_register (regnum, debaton->frame);
}

amd64-tdep.c
============
oh! oh! this isn't in gdbserver!

set_gdbarch_dwarf2_reg_to_regnum (gdbarch, amd64_dwarf_reg_to_regnum);

...

/* DWARF Register Number Mapping as defined in the System V psABI,
section 3.6. */
static int amd64_dwarf_regmap[] =...


http://www.x86-64.org/documentation/abi.pdf
===========================================

Figure 3.36: DWARF Register Number Mapping:

Register Name Number Abbreviation
------------------------------------------
Segment Register FS 54 %fs
Segment Register GS 55 %gs
...
FS Base address 58 %fs.base
GS Base address 59 %gs.base


DWARF4 spec
===========

"The DW_OP_plus_uconst operation pops the top stack entry, adds it to
the unsigned LEB128 constant operand and pushes the result. This
operation is supplied specifically to be able to encode more field
offsets in two bytes than can be done with “DW_OP_litn DW_OP_plus”."
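
To make the "two bytes" point concrete, here is a small C sketch that emits both encodings. The opcode values are the ones from the DWARF specification; the emit_* helper names are made up for illustration. DW_OP_litn only covers 0..31, while a one-byte ULEB128 operand covers 0..127.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_OP_plus        0x22
#define DW_OP_plus_uconst 0x23
#define DW_OP_lit0        0x30

/* "DW_OP_litn DW_OP_plus": always two bytes, but OFFSET must be 0..31.  */
static size_t
emit_litn_plus (unsigned offset, uint8_t *out)
{
  assert (out != NULL && offset <= 31);
  out[0] = (uint8_t) (DW_OP_lit0 + offset);
  out[1] = DW_OP_plus;
  return 2;
}

/* "DW_OP_plus_uconst <ULEB128>": two bytes for any OFFSET in 0..127.
   OUT needs room for up to 11 bytes.  */
static size_t
emit_plus_uconst (uint64_t offset, uint8_t *out)
{
  size_t n = 0;

  assert (out != NULL);
  out[n++] = DW_OP_plus_uconst;
  do
    {
      uint8_t byte = offset & 0x7f;

      offset >>= 7;
      if (offset != 0)
        byte |= 0x80;           /* More 7-bit groups follow.  */
      out[n++] = byte;
    }
  while (offset != 0);
  return n;
}
```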

https://en.wikipedia.org/wiki/LEB128
====================================

To encode a number using LEB128:
1) a) zero extend the number up to a multiple of 7 bits (for ULEB128)
b) sign extend the number up to a multiple of 7 bits (for SLEB128)
2) Break the number up into groups of 7 bits.
3) Output one encoded byte for each 7 bit group, from least
significant to most significant group.
4) Set the most significant bit on each byte except the last byte.
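
The four steps above, sketched in C. Function names are mine, not GDB's (GDB's own readers are the safe_read_uleb128 / safe_read_sleb128 seen earlier). The signed variant works on the unsigned bit pattern and refills the top bits by hand so it does not depend on how the compiler shifts negative values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode VALUE as ULEB128.  BUF needs room for up to 10 bytes.
   Returns the number of bytes written.  */
static size_t
encode_uleb128 (uint64_t value, uint8_t *buf)
{
  size_t n = 0;

  assert (buf != NULL);
  do
    {
      uint8_t byte = value & 0x7f;      /* Least significant group.  */

      value >>= 7;
      if (value != 0)
        byte |= 0x80;                   /* Not the last byte.  */
      buf[n++] = byte;
    }
  while (value != 0);
  return n;
}

/* Encode VALUE as SLEB128.  BUF needs room for up to 10 bytes.
   Returns the number of bytes written.  */
static size_t
encode_sleb128 (int64_t value, uint8_t *buf)
{
  uint64_t bits = (uint64_t) value;
  int negative = value < 0;
  size_t n = 0;

  assert (buf != NULL);
  for (;;)
    {
      uint8_t byte = bits & 0x7f;
      int sign_bit_set = (byte & 0x40) != 0;

      bits >>= 7;
      if (negative)
        bits |= ~(uint64_t) 0 << 57;    /* Keep the sign extension.  */

      /* Stop once the rest is pure sign extension and the sign bit of
         this group agrees with it.  */
      if ((bits == 0 && !sign_bit_set)
          || (bits == UINT64_MAX && sign_bit_set))
        {
          buf[n++] = byte;
          return n;
        }
      buf[n++] = byte | 0x80;
    }
}
```

The test values 624485 and -123456 are the worked examples from the Wikipedia page linked above.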

So
==
"breg 0(%fs)" -> DW_OP_bregx, 54, 0
w00t :)
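
As raw bytes that would come out as something like the sketch below (C). DW_OP_bregx is 0x92 in the DWARF spec; emit_bregx and the put_* helpers are hypothetical names. The register number is a parameter, so the same helper covers 54 (%fs) and 58 (%fs.base) from the table above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_OP_bregx 0x92  /* Opcode value from the DWARF specification.  */

/* Minimal ULEB128 writer; returns bytes written (at most 10).  */
static size_t
put_uleb128 (uint64_t value, uint8_t *out)
{
  size_t n = 0;

  do
    {
      uint8_t byte = value & 0x7f;

      value >>= 7;
      out[n++] = value != 0 ? (uint8_t) (byte | 0x80) : byte;
    }
  while (value != 0);
  return n;
}

/* Minimal SLEB128 writer; returns bytes written (at most 10).  */
static size_t
put_sleb128 (int64_t value, uint8_t *out)
{
  uint64_t bits = (uint64_t) value;
  int negative = value < 0;
  size_t n = 0;

  for (;;)
    {
      uint8_t byte = bits & 0x7f;
      int sign = (byte & 0x40) != 0;

      bits >>= 7;
      if (negative)
        bits |= ~(uint64_t) 0 << 57;    /* Manual sign extension.  */
      if ((bits == 0 && !sign) || (bits == UINT64_MAX && sign))
        {
          out[n++] = byte;
          return n;
        }
      out[n++] = byte | 0x80;
    }
}

/* Emit "DW_OP_bregx <ULEB128 reg> <SLEB128 offset>" into OUT, which
   needs room for up to 21 bytes.  Returns the expression length.  */
static size_t
emit_bregx (uint64_t dwarf_reg, int64_t offset, uint8_t *out)
{
  size_t n = 0;

  assert (out != NULL);
  out[n++] = DW_OP_bregx;
  n += put_uleb128 (dwarf_reg, out + n);
  n += put_sleb128 (offset, out + n);
  return n;
}
```

So "DW_OP_bregx, 54, 0" is three bytes, 0x92 0x36 0x00, and swapping in register 58 only changes the middle byte.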


Oh
==
There's *lots* of GDB in dwarf2expr
but, see dwarf2_compile_expr_to_ax in dwarf2loc.c
(we would have to disable anything that used the
struct dwarf2_per_cu_data *per_cu argument,
it's opaque to dwarf2read.c)
Could we compile dwarf to ax and squirt that at gdbserver?
(then all the common-infinity stuff could go up to GDB *sigh*)

To do
=====
1. put actual bytecode into glibc's map_lwp2thr
2. execute it hackily in GDB side-by-side with the real func
3. try converting to ax and squirting that at gdbserver??
(see above re disabling ops that use per_cu)

Might be easier to write a second dwarf interpreter
===================================================
- existing one is pretty heavyweight (lots of GDB in there,
gdbarch
tdep for register numbers
values
and lots of DWARF that doesn't exist for just some bytecode)
Also, it's designed to execute millions of expressions once
whereas I need something to execute a few expressions
millions of times so some pre-execution analysis might help
eg I can allocate memory (which would slow the other down)
and pre-read all the LEB128, do stack overflow checks, etc

- can print disassembly in pre-analysis for debugging!
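
A sketch of what that pre-execution pass might look like, in C. It handles only a handful of opcodes (values from the DWARF spec), and struct insn, read_leb128, precompile and disassemble are all invented names, not GDB code. The idea: decode every LEB128 operand once, check stack depth once, and leave an array the hot loop can walk without re-parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Opcode values from the DWARF specification; only a tiny subset.  */
enum { OP_deref = 0x06, OP_plus = 0x22, OP_plus_uconst = 0x23,
       OP_lit0 = 0x30, OP_lit31 = 0x4f, OP_bregx = 0x92 };

/* One pre-decoded instruction (hypothetical layout).  */
struct insn { uint8_t op; uint64_t u; int64_t s; };

/* Read one LEB128 value, sign-extending if IS_SIGNED.  Returns the new
   read pointer, or NULL if the bytecode ends mid-value.  */
static const uint8_t *
read_leb128 (const uint8_t *p, const uint8_t *end, int is_signed,
             uint64_t *out)
{
  uint64_t result = 0;
  unsigned shift = 0;

  while (p < end)
    {
      uint8_t byte = *p++;

      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if (!(byte & 0x80))
        {
          if (is_signed && shift < 64 && (byte & 0x40))
            result |= ~(uint64_t) 0 << shift;
          *out = result;
          return p;
        }
    }
  return NULL;
}

/* Decode CODE into OUT and track the deepest stack use in *MAX_DEPTH.
   Returns the instruction count, or -1 on an unhandled opcode, a
   truncated operand, stack underflow, or too many instructions.  */
static int
precompile (const uint8_t *code, size_t len, struct insn *out,
            int max_insns, int *max_depth)
{
  const uint8_t *p = code, *end = code + len;
  int n = 0, depth = 0;

  assert (code != NULL && out != NULL && max_depth != NULL);
  *max_depth = 0;
  while (p < end)
    {
      struct insn in = { *p++, 0, 0 };
      uint64_t tmp = 0;
      int pops = 0;     /* Every op in this subset pushes one value.  */

      if (n == max_insns)
        return -1;
      if (in.op >= OP_lit0 && in.op <= OP_lit31)
        in.u = in.op - OP_lit0;
      else if (in.op == OP_deref)
        pops = 1;
      else if (in.op == OP_plus)
        pops = 2;
      else if (in.op == OP_plus_uconst)
        {
          pops = 1;
          p = read_leb128 (p, end, 0, &in.u);
        }
      else if (in.op == OP_bregx)
        {
          p = read_leb128 (p, end, 0, &in.u);
          if (p != NULL)
            p = read_leb128 (p, end, 1, &tmp);
          in.s = (int64_t) tmp;   /* Two's complement assumed.  */
        }
      else
        return -1;
      if (p == NULL || depth < pops)
        return -1;
      depth += 1 - pops;
      if (depth > *max_depth)
        *max_depth = depth;
      out[n++] = in;
    }
  return n;
}

/* Debug aid: dump the pre-decoded form.  */
static void
disassemble (const struct insn *insns, int n, FILE *fp)
{
  for (int i = 0; i < n; i++)
    fprintf (fp, "%3d: op=0x%02x u=%llu s=%lld\n", i,
             (unsigned) insns[i].op, (unsigned long long) insns[i].u,
             (long long) insns[i].s);
}
```

With max_depth known up front, the executor could allocate its stack once and skip per-op overflow checks.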
18 changes: 18 additions & 0 deletions THOUGHTS
@@ -0,0 +1,18 @@
make function names provider::name(2)0 or provider::name(ai)ia ?
- allows for different versions of the same function (with different
args, rets)
- conveniently uses four extra bytes in each symbol table slot
(reserved for function typearray offsets if we allow for functions
calling functions in the future... notes with nonzero in them can
for now be skipped as unhandled)
- allows consumers to select functions with one strcmp
- allows for pre-processing arguments and post-processing returns
(e.g. will they need widening, narrowing, sign-extension, that shit?)
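
Roughly what the naming scheme buys, sketched in C. make_fullname and find_function are hypothetical (the real helper is the i8_make_fullname mentioned below, whose signature isn't shown here), and the type letters are just the ones from the example name above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "provider::name(argtypes)rettypes" into BUF.
   Returns 0 on success, -1 if BUF is too small.  */
static int
make_fullname (char *buf, size_t size, const char *provider,
               const char *name, const char *argtypes,
               const char *rettypes)
{
  int n;

  assert (buf != NULL && provider != NULL && name != NULL);
  assert (argtypes != NULL && rettypes != NULL);
  n = snprintf (buf, size, "%s::%s(%s)%s", provider, name,
                argtypes, rettypes);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Consumer side: one strcmp per candidate picks the exact version
   (same provider, name, args and rets).  Returns the index or -1.  */
static int
find_function (const char *const *fullnames, size_t count,
               const char *wanted)
{
  assert (fullnames != NULL && wanted != NULL);
  for (size_t i = 0; i < count; i++)
    if (strcmp (fullnames[i], wanted) == 0)
      return (int) i;
  return -1;
}
```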

rename i8_make_fullname as symbol_note_make_fullname,
and replace provider, name in infinity_function with fullname

make the two slots (and num_args, num_rets)
be string table offsets
(empty one is easy: the end of another string??,
or a special value (0xffff???->0x0000 is a valid string table offset!))
