[misc] various notes

gbenson · Nov 16, 2015 · a6f7459 · a6f7459
1 parent 1d44942
commit a6f7459
Showing 6 changed files with 242 additions and 0 deletions.
diff --git a/FUNCTIONS b/FUNCTIONS
@@ -0,0 +1,44 @@
+td_ta_map_lwp2thr:
+ - always needs to call ps_getpid (currently passed as an argument)
+ - needs to access inferior memory (done)
+ - generally needs to call ps_get_thread_area (SYSTEM FUNC CALLS)
+ - returns multiple values
+
+td_ta_thr_iter:
+ - walks linked lists in inferior
+ - calls a user-supplied function:   (SYSTEM FUNC CALLS VIA ARG)
+   Args:
+     - (supposedly) opaque (to caller) thread handle
+     - opaque to libthread_db void *callback_data (just passed in)
+   Returns:
+     - just some int to indicate an error
+ - also calls ps_getpid (could easily be passed as an argument)
+
+td_thr_get_info:
+ - writes into a structure passed as an argument
+   (probably no portable way to do this -- sizes, offsets different)
+   maybe just return a lot of values
+ - also calls ps_getpid (less obvious as an argument but why not?)
+
+td_thr_tls_get_addr:
+ - accesses link map (which is not in the same solib... need an
+   INFINITY_EXPORT there)
+ - basically a wrapper for td_thr_tlsbase (I8FUNC-I8FUNC FUNC CALLS)
+
+td_thr_tlsbase:
+ - Calls __td_ta_lookup_th_unique (part of map_lwp2thr!)
+ - dtv_slotinfo???
+
+Basically need to have functions as first class objects.
+All the function calls could be rolled into DW_OP_GNU_i8call
+
+Need some (convention? enforcement?) to say "these functions
+are internal"
+
+I think it has to be convention, there's no way to group functions
+together other than by provider (no cross-provider calling of internal
+functions? but we're allowing cross-provider symbol reference so...)
+
+Thread handles from td_ta_map_lwp2thr/td_ta_thr_iter and to
+td_thr_get_info will likely be just the th_unique member of
+td_thrhandle_t.
diff --git a/FUTURE b/FUTURE
@@ -0,0 +1,2 @@
+If x86_64 map_lwp2thr can be rewritten using DW_OP_bregx %fs.base, 0
+then that could be a significant performance boost.
diff --git a/HELLO b/HELLO
@@ -0,0 +1,37 @@
+thread_db returns td_thrhandle_t objects as handles
+these are NOT OPAQUE
+although the declaration says it is:
+
+  /* The actual thread handle type.  This is also opaque.  */
+  typedef struct td_thrhandle
+  {
+    td_thragent_t *th_ta_p;
+    psaddr_t th_unique;
+  } td_thrhandle_t;
+
+go figure
+
+GDB doesn't touch either, except one inconsequential time:
+
+  molly:[src]$ grep -r 'th_\(ta_p\|unique\)' gdb
+  gdb/linux-thread-db.c:  th.th_unique = 0;
+  gdb/nat/glibc_thread_db.h:  td_thragent_t *th_ta_p;
+  gdb/nat/glibc_thread_db.h:  psaddr_t th_unique;
+
+So we can return just "psaddr_t th_unique" and use that as our "handle"
+
+---
+
+For td_thr_get_info GDB passes in a pointer to a td_thrinfo_t structure.
+(which has lots of elements, grrr)
+return a ton of values and let the caller fill in the struct?
+this is where referencing functions by typed names will be handy...
+for cross-core debugging the struct size&offsets might not be the same, so...
+
+infinity-thread-db.c
+collects new func notifications (per inferior? per whatever makes sense 
+for thread-db?)
+and provides 4 i8_thread_db_* functions which wrap all this
+(into td_thrhandle_t stuff too?
+with same signatures as libthread_db funcs)
+
diff --git a/HOOKIN b/HOOKIN
@@ -0,0 +1,21 @@
+infinity-thread-db.c will enable thread-debugging per-pspace,
+linux-thread-db.c tracks libthread_db instances by PID:
+
+  static struct thread_db_info *get_thread_db_info (int pid)
+
+What I did in the original gbenson/infinity branch was right.
+
+* Have infinity-thread-db.c watch functions arriving and collect them
+  in some per-pspace struct (which has some is_complete field which
+  the i8func observers maintain).
+
+* Whenever linux-thread-db.c sees a new objfile (hopefully AFTER
+  infinity(-notes).c saw it!) its thread_db_new_objfile observer
+  calls check_for_thread_db which calls thread_db_load.
+
+* thread_db_load can grab a (fake, per-pid/per-pspace) "dlopen" handle
+  from infinity-thread-db.c that it can pass to thread_db_load_* as
+  before.  Also infinity-thread-db.c should have some hacky dlsym
+  and dlclose wrappers. (maybe dlopen too, for completeness).
+
+Boom, done.
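The fake "dlopen" handle and the dlsym/dlclose wrappers described above could look roughly like the sketch below. Every name in it (i8_handle, i8_symbol, i8_dlopen, i8_dlsym, i8_dlclose) is hypothetical and not taken from GDB or the infinity branch. Unlike the real dlsym, this i8_dlsym returns a function pointer type rather than void *, to stay within standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer type used for all collected functions
   (an assumption for this sketch).  */
typedef void (*i8_func_ptr) (void);

struct i8_symbol
{
  const char *name;
  i8_func_ptr func;
};

/* Hypothetical per-pspace state maintained by the i8func observers.  */
struct i8_handle
{
  int is_complete;                  /* all required functions seen?  */
  const struct i8_symbol *symbols;  /* functions collected so far  */
  size_t num_symbols;
};

/* "dlopen": hand out the per-pspace state as an opaque handle, but
   only once every required function has arrived.  */
void *
i8_dlopen (struct i8_handle *state)
{
  assert (state != NULL);
  return state->is_complete ? state : NULL;
}

/* "dlsym": look NAME up in the functions collected for HANDLE.
   Returns NULL if the handle is NULL or the name is unknown.  */
i8_func_ptr
i8_dlsym (void *handle, const char *name)
{
  const struct i8_handle *state = handle;
  size_t i;

  if (state == NULL)
    return NULL;
  for (i = 0; i < state->num_symbols; i++)
    if (strcmp (state->symbols[i].name, name) == 0)
      return state->symbols[i].func;
  return NULL;
}

/* "dlclose": nothing to release, the observers own the state.  */
int
i8_dlclose (void *handle)
{
  (void) handle;
  return 0;
}
```

With something of this shape, thread_db_load could pass the handle to thread_db_load_* unchanged and only the symbol lookup would differ from the libthread_db path.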
diff --git a/NOTES b/NOTES
@@ -0,0 +1,120 @@
+accessor for symbol slots (requires address size baked into func?)
+
+dwarf2expr.c
+============
+
+from execute_stack_op:
+  /* Old-style "untyped" DWARF values need special treatment in a
+     couple of places, specifically DW_OP_mod and DW_OP_shr.  We need
+     a special type for these values so we can distinguish them from
+     values that have an explicit type, because explicitly-typed
+     values do not need special treatment.  This special type must be
+     different (in the `==' sense) from any base type coming from the
+     CU.  */
+  -> DWARF is typed?
+
+case DW_OP_bregx:
+    op_ptr = safe_read_uleb128 (op_ptr, op_end, &reg);
+    op_ptr = safe_read_sleb128 (op_ptr, op_end, &offset);
+    result = (ctx->funcs->read_addr_from_reg) (ctx->baton, reg);
+    result += offset;
+    result_val = value_from_ulongest (address_type, result);
+    ...
+    dwarf_expr_push (ctx, result_val, in_stack_memory);
+
+dwarf2loc.c
+===========
+
+static CORE_ADDR
+dwarf_expr_read_addr_from_reg (void *baton, int dwarf_regnum)
+{
+  struct dwarf_expr_baton *debaton = (struct dwarf_expr_baton *) baton;
+  struct gdbarch *gdbarch = get_frame_arch (debaton->frame);
+  int regnum = gdbarch_dwarf2_reg_to_regnum (gdbarch, dwarf_regnum);
+
+  return address_from_register (regnum, debaton->frame);
+}
+
+amd64-tdep.c
+============
+  oh! oh! this isn't in gdbserver!
+
+  set_gdbarch_dwarf2_reg_to_regnum (gdbarch, amd64_dwarf_reg_to_regnum);
+
+  ...
+
+  /* DWARF Register Number Mapping as defined in the System V psABI,
+   section 3.6.  */
+  static int amd64_dwarf_regmap[] =...
+
+
+http://www.x86-64.org/documentation/abi.pdf
+===========================================
+
+  Figure 3.36: DWARF Register Number Mapping:
+
+    Register Name          Number   Abbreviation
+    ------------------------------------------
+    Segment Register FS        54   %fs
+    Segment Register GS        55   %gs
+    ...
+    FS Base address            58   %fs.base
+    GS Base address            59   %gs.base
+
+
+DWARF4 spec
+===========
+
+  "The DW_OP_plus_uconst operation pops the top stack entry, adds it to
+  the unsigned LEB128 constant operand and pushes the result.  This
+  operation is supplied specifically to be able to encode more field
+  offsets in two bytes than can be done with “DW_OP_litn DW_OP_plus”."
+
+https://en.wikipedia.org/wiki/LEB128
+====================================
+
+To encode a number using LEB128:
+ 1) a) zero extend the number up to a multiple of 7 bits (for ULEB128)
+    b) sign extend the number up to a multiple of 7 bits (for SLEB128)
+ 2) Break the number up into groups of 7 bits.
+ 3) Output one encoded byte for each 7 bit group, from least
+    significant to most significant group.
+ 4) Set the most significant bit on each byte except the last byte.
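The four encoding steps above can be sketched in C as follows. The function names uleb128_encode and sleb128_encode are made up for this sketch and are not GDB's. The signed variant works on an unsigned copy and refills the sign bits by hand, so it does not rely on how the compiler shifts negative numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode VALUE as ULEB128 into OUT and return the number of bytes
   written.  OUT needs room for up to 10 bytes (a full 64-bit value).  */
size_t
uleb128_encode (uint64_t value, uint8_t *out)
{
  size_t n = 0;

  assert (out != NULL);
  do
    {
      uint8_t byte = value & 0x7f;  /* least significant 7-bit group  */

      value >>= 7;
      if (value != 0)
        byte |= 0x80;               /* more groups follow  */
      out[n++] = byte;
    }
  while (value != 0);
  return n;
}

/* Encode VALUE as SLEB128 into OUT and return the number of bytes
   written.  OUT needs room for up to 10 bytes.  */
size_t
sleb128_encode (int64_t value, uint8_t *out)
{
  const int negative = value < 0;
  uint64_t bits = (uint64_t) value;
  size_t n = 0;
  int more = 1;

  assert (out != NULL);
  while (more)
    {
      uint8_t byte = bits & 0x7f;

      bits >>= 7;
      if (negative)
        bits |= ~(UINT64_MAX >> 7); /* sign-extend after the shift  */

      /* Stop once the rest is pure sign extension and the sign bit
         of this group (0x40) already agrees with it.  */
      if ((bits == 0 && !(byte & 0x40))
          || (bits == UINT64_MAX && (byte & 0x40)))
        more = 0;
      else
        byte |= 0x80;
      out[n++] = byte;
    }
  return n;
}
```

For example, 624485 encodes as the three bytes e5 8e 26 in ULEB128, and -123456 as c0 bb 78 in SLEB128.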
+
+So
+==
+  "breg 0(%fs)" -> DW_OP_bregx, 58 (%fs.base), 0
+w00t :)
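As raw bytes, a DW_OP_bregx operation is the opcode (0x92 in the DWARF spec), then the register number as ULEB128, then the offset as SLEB128. The helper below is a hypothetical sketch, not a GDB function, and it only handles operands that fit in one LEB128 byte each. That is enough for the register numbers in the psABI table quoted earlier (54 for %fs, 58 for %fs.base) with a zero offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_OP_bregx 0x92  /* opcode value from the DWARF spec  */

/* Write "DW_OP_bregx REG, OFFSET" into OUT (at least 3 bytes) and
   return the number of bytes written.  Only single-byte LEB128
   operands are supported in this sketch.  */
size_t
emit_bregx_small (uint8_t *out, unsigned reg, int offset)
{
  assert (out != NULL);
  assert (reg < 0x80);                     /* one ULEB128 byte  */
  assert (offset >= -64 && offset <= 63);  /* one SLEB128 byte  */

  out[0] = DW_OP_bregx;
  out[1] = (uint8_t) reg;
  out[2] = (uint8_t) (offset & 0x7f);
  return 3;
}
```

Under these assumptions, register 58 with offset 0 comes out as the three bytes 92 3a 00.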
+
+
+Oh
+==
+There's *lots* of GDB in dwarf2expr
+but, see dwarf2_compile_expr_to_ax in dwarf2loc.c
+(we would have to disable anything that used the
+struct dwarf2_per_cu_data *per_cu argument,
+it's opaque to dwarf2read.c)
+Could we compile dwarf to ax and squirt that at gdbserver?
+(then all the common-infinity stuff could go up to GDB *sigh*)
+
+To do
+=====
+1. put actual bytecode into glibc's map_lwp2thr
+2. execute it hackily in GDB side-by-side with the real func
+3. try converting to ax and squirting that at gdbserver??
+   (see above re disabling ops that use per_cu)
+
+Might be easier to write a second dwarf interpreter
+===================================================
+- existing one is pretty heavyweight (lots of GDB in there,
+   gdbarch
+   tdep for register numbers
+   values
+  and lots of DWARF that doesn't exist for just some bytecode)
+  Also, it's designed to execute millions of expressions once
+  whereas I need something to execute a few expressions
+  millions of times so some pre-execution analysis might help
+  eg I can allocate memory (which would slow the other down)
+  and pre-read all the LEB128, do stack overflow checks, etc
+
+- can print disassembly in pre-analysis for debugging!
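A minimal sketch of the "analyse once, execute many times" idea follows. It assumes a toy subset of four operations (DW_OP_lit0..31, DW_OP_constu, DW_OP_plus, DW_OP_plus_uconst, with opcode values from the DWARF spec). The struct and function names are invented for this sketch and are not from GDB. The prepare pass decodes every LEB128 operand and checks stack depth up front, so the run loop needs no bounds checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum
{
  DW_OP_constu = 0x10, DW_OP_plus = 0x22, DW_OP_plus_uconst = 0x23,
  DW_OP_lit0 = 0x30, DW_OP_lit31 = 0x4f
};

#define MAX_INSNS 64
#define MAX_STACK 16

struct insn { uint8_t op; uint64_t operand; };
struct program { struct insn insns[MAX_INSNS]; size_t count; };

/* Decode one ULEB128 from [P, END).  Returns a pointer past it, or
   NULL if the encoding runs off the end of the buffer.  */
static const uint8_t *
read_uleb128 (const uint8_t *p, const uint8_t *end, uint64_t *value)
{
  uint64_t result = 0;
  unsigned shift = 0;

  while (p < end)
    {
      uint8_t byte = *p++;

      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if (!(byte & 0x80))
        {
          *value = result;
          return p;
        }
    }
  return NULL;
}

/* Pre-execution analysis: pre-read operands, reject unknown opcodes,
   and verify the stack never underflows or exceeds MAX_STACK.
   Returns 0 on success, -1 on any error.  */
int
program_prepare (struct program *prog, const uint8_t *code, size_t len)
{
  const uint8_t *p = code, *end = code + len;
  size_t depth = 0;

  prog->count = 0;
  while (p < end)
    {
      struct insn in = { *p++, 0 };

      if (prog->count == MAX_INSNS)
        return -1;
      if (in.op >= DW_OP_lit0 && in.op <= DW_OP_lit31)
        {
          /* Fold literals into a pre-decoded constant push.  */
          in.operand = in.op - DW_OP_lit0;
          in.op = DW_OP_constu;
          depth++;
        }
      else if (in.op == DW_OP_constu)
        {
          if ((p = read_uleb128 (p, end, &in.operand)) == NULL)
            return -1;
          depth++;
        }
      else if (in.op == DW_OP_plus_uconst)
        {
          if (depth < 1
              || (p = read_uleb128 (p, end, &in.operand)) == NULL)
            return -1;
        }
      else if (in.op == DW_OP_plus)
        {
          if (depth < 2)
            return -1;
          depth--;
        }
      else
        return -1;
      if (depth > MAX_STACK)
        return -1;
      prog->insns[prog->count++] = in;
    }
  return depth >= 1 ? 0 : -1;
}

/* Run a prepared program and return the top of stack.  All checks
   were done in program_prepare, so this loop does none.  */
uint64_t
program_run (const struct program *prog)
{
  uint64_t stack[MAX_STACK];
  size_t sp = 0, i;

  assert (prog->count > 0);
  for (i = 0; i < prog->count; i++)
    switch (prog->insns[i].op)
      {
      case DW_OP_constu: stack[sp++] = prog->insns[i].operand; break;
      case DW_OP_plus_uconst: stack[sp - 1] += prog->insns[i].operand; break;
      case DW_OP_plus: sp--; stack[sp - 1] += stack[sp]; break;
      }
  return stack[sp - 1];
}
```

The prepare pass is also the natural place to print a disassembly, since every instruction is already decoded into one struct insn.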
diff --git a/THOUGHTS b/THOUGHTS
@@ -0,0 +1,18 @@
+make function names provider::name(2)0 or provider::name(ai)ia ?
+- allows for different versions of the same function (with different
+  args, rets)
+- conveniently uses four extra bytes in each symbol table slot
+  (reserved for function typearray offsets if we allow for functions
+  calling functions in the future... notes with nonzero in them can
+  for now be skipped as unhandled)
+- allows consumers to select functions with one strcmp
+- allows for pre-processing arguments and post-processing returns
+  (e.g. will they need widening, narrowing, sign-extension, that shit?)
+
+rename i8_make_fullname as symbol_note_make_fullname,
+and replace provider, name in infinity_function with fullname
+
+make the two slots (and num_args, num_rets)
+be string table offsets
+(empty one is easy: the end of another string??,
+ or a special value (0xffff???->0x0000 is a valid string table offset!))
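A sketch of the naming scheme, reading "provider::name(ai)ia" as provider, function name, argument type characters in parentheses, then return type characters. That reading is an assumption, as is the signature given to symbol_note_make_fullname here. fullname_matches is a hypothetical helper showing the "one strcmp" selection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "provider::name(argtypes)rettypes" in BUF.  Returns 0 on
   success, -1 if BUF is too small or formatting fails.  */
int
symbol_note_make_fullname (char *buf, size_t size,
                           const char *provider, const char *name,
                           const char *argtypes, const char *rettypes)
{
  int len;

  assert (buf != NULL && size > 0);
  len = snprintf (buf, size, "%s::%s(%s)%s",
                  provider, name, argtypes, rettypes);
  return (len < 0 || (size_t) len >= size) ? -1 : 0;
}

/* A consumer selects the exact function and signature it wants
   with a single string comparison.  */
int
fullname_matches (const char *fullname, const char *wanted)
{
  return strcmp (fullname, wanted) == 0;
}
```

Two versions of the same function with different arguments or returns then get distinct full names and never compare equal.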