TODO.20001025
-*- Text -*-
Roadmap
-------
The primary features holding up a true release are Bounce handling and a
web interface.
Simplistic bounce handling is a relatively easy project; just write a
module that looks for bounces and tap into the _owner function. Code for
the module is already written. Some additional support is needed, though;
the user-id stuff and the support for it in the envelope generator is
essential in making the system foolproof.
There are two basic ways to do the web interface: build up a set of small
CGIs that perform basic functions (subscribe, unsubscribe, get info, etc.)
or port MajorCool. The latter is pretty tough given the perl4/perl5
impedance mismatch but could pay off in a big way.
Bugs?
-----
Messages with empty first part always checksum the same?
The interface can pass bad passwords to some functions (config_get_vars)
and get a result indistinguishable from using no password.
Untainting sometimes happens in the wrong place; in a C/S model, we won't
trust what comes in over the network.
Requests from SRE
-----------------
Add an 'address' argument to the inform variable; address to send
information to.
Doc changes needed
------------------
(none)
Top of the list
---------------
List converter:
if -digest, offer to create non-digest version and add subscribers to digest.
New stored token format:
Data::Dumper to store a hash. The contents are essentially opaque to the
token system; upon extraction certain elements are passed to the bottom
half in order.
Pass token routines a hash. Store it verbatim in the database. Note
that we can't use SimpleDB for this because the stringification breaks
opacity.
Token::t_add needs to take a hash. In fact, it shouldn't even look at
its hash, it should just store it.
Token::t_remove should just return the hash uninterpreted.
We do want speed, so Storable is warranted, but not for token storage and
we require Data::Dumper for other things anyway. For the list stuff, it is
nice to be able to compare without unstringifying, so it is probably
simpler to keep the current ^A and ^B separators to get a fast two-level
stringifier and just use Data::Dumper for when we want to store a hash.
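A minimal sketch of the opaque token storage described above, assuming a plain in-memory hash as a stand-in for the token database. The helper names freeze_token and thaw_token are hypothetical, and the t_add/t_remove shown here are simplified illustrations, not the real Token.pm routines.

```perl
use strict;
use warnings;
use Data::Dumper;

# Turn an arbitrary hash into a single-line string without inspecting it.
sub freeze_token {
    my ($data) = @_;
    local $Data::Dumper::Indent   = 0;   # one line per record
    local $Data::Dumper::Terse    = 1;   # bare structure, no "$VAR1 ="
    local $Data::Dumper::Sortkeys = 1;   # identical hashes give identical strings
    return Dumper($data);
}

# Rebuild the hash. Only acceptable for strings we wrote ourselves,
# since this evaluates the stored text as Perl.
sub thaw_token {
    my ($string) = @_;
    my $data = eval "($string)";
    die "corrupt token data: $@" if $@;
    return $data;
}

my %db;   # stand-in for the token database (assumption)

# Store the hash verbatim; never look inside it.
sub t_add {
    my ($token, $data) = @_;
    $db{$token} = freeze_token($data);
    return $token;
}

# Return the hash uninterpreted and drop the token.
sub t_remove {
    my ($token) = @_;
    my $string = delete $db{$token};
    return defined $string ? thaw_token($string) : undef;
}

t_add('ABCD-1234', { command => 'subscribe', list => 'test', args => ['user@example.com'] });
my $back = t_remove('ABCD-1234');
print "$back->{command} $back->{list} $back->{args}[0]\n";
```

Storable's freeze/thaw could be swapped in behind the same two helpers if speed ever matters for tokens.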
Convert the following to take hashes:
Access::_a_deny, _a_denymess, _a_allow, _a_conf_cons, _a_confirm,
_a_consult, _a_forward, _a_reply, _a_replyfile, _a_mailfile, _a_default
Access::_d_advertise, _d_password, _d_post, _d_subscribe
Need to pass additional context-specific data to consult/confirm routines.
First make Token routines take hashes, then pass an additional hashref of
subs. Then Access::_a_* should accept this and pass it on.
Mj::TokenDB - need some kind of multilevel structure and a way to store
arbitrary data in a token. Perhaps can the entire database thing and start
from scratch with a DB-based system. Use Storable?
Have a separate file for post consultation/confirmation reminders.
Queueing:
Runners automatically exit after processing some number of messages (to
prevent the accumulation of memory leaks).
The server does the same, only after some much larger number, as it does no
allocations.
Upon start of processing, move message into 'hold queue'. If not
deleted or moved elsewhere, there must have been a crash of some kind.
Have trigger process deal with these by making specific debug files and
storing them along with the generated debug files in some specific place.
Allow mj_email/mj_enqueue to be setuid under qmail, if the user wants it.
Make sure cached configurations don't persist past config file changes.
Add option to set umask of generated alias files.
Don't show the user variables that they can't change. (Make their values
visible to the interface but don't have configshow display them).
Move all aliases into one file, and all vut entries into one file, to make
sendmail configuration easier.
Move t_accept messages into external files.
Return additional information from _post so that t_accept can provide
additional data.
Update FILES.
Delay allocation of Dest/Sorter objects until something actually goes into
them.
Make Deliver take a list of delivery hashes, so you can pass in a whole
pile of class/message sets (including multiple files to the same class,
as long as they're in a different delivery hash). This would allow a
queue runner to deliver multiple messages to one list with only one
database scan.
Do more substitutions in mailfile actions. Allow a hashref of
substitutions to be passed in.
Move to a more module-based system; allow various filtering pieces of resend:
mime filters
checksumming
taboo/admin
to be moved into modules; have per-message modules and per-line modules.
Allow modules to be added; have a variable listing modules to be run.
Define module interface. Move lesser-used commands out of Majordomo.pm
(and Format.pm) and into separate modules.
! Digest:
Way to force the users to use the digest version:
allowed_subscribtion_classes - array of names of subscription classes that
users can join without owner's password approval. 'digest' covers all
digests, 'digest-blah' allows only 'blah' digest.
Access: call Mj::is_subscriber instead of List::is_subscriber.
Check validity of addresses in Resend::post and pass the results in as
access variables.
General munging framework (as part of _post):
Make only one pass over body.
Trim approvals, attach fronter/footer, do HTML->text conversion and
munging all at once.
Get back archive and (4) normal entities.
Structure (munge body first):
Traverse mime tree.
Delete parts if necessary.
Flatten multipart/alternative
Flatten single-part multiparts
-Now we know if we need to attach fronters/footers or munge them in-
Convert parts (richtext/html) to text, possibly munging.
Add fronters/footers and munge.
Munge headers.
Delete headers.
Reply-To:, subject_prefix
F'ter stuff:
Since _add_fters is the only copy operation in the normal path, overload
it to do proposed elimination of various body lines (rename to
_munge_body).
Add /^-/ intelligence to embed newlines in a string_2darray.
If fronter/footer contains only FILE-some/file/path then pull in that file
to use as the fronter/footer and use its MIME data for the attachment
info (i.e. enable easy HTML attachment footers)
Need a variable to always attach fronters/footers.
! Auto-maintain aliases:
Call out to commands to rebuild aliases and VUT files.
! Archiver:
Build index from existing file (sync)
Delete message from archive?
Automatically add meaningful description to created archive files.
Automatically create a symlink from archive_dir to files/public/archive
if necessary.
Search the archive
How to specify the time period here? Two dates? Leave one and it goes
from then to now.
! Utility method to set the default language from the interface.
Check the user's chosen language. Should the language used be that of
the victim or the requester? The requester will see the immediate
output, while the victim will receive instructions. Or look up the
language preference of the user when sending out end-user information.
- Extension mechanism: one file, in $::TOPDIR/$domain, called
local_extensions.pl. Pulled in after all properties are defined and
mj_cf_data is loaded (so the two important hashes have been defined).
Provide methods to call in local_extensions.pl:
add_feature: adds a feature to the feature list (to advertise the
extension if necessary)
add_command: adds a command to the big function structure. Takes lots of
data to be filled in.
add_alias: adds an alias for a command (localization, I suppose).
add_variable: pushes a variable into the big config hash.
Adding a function: need three functions:
output formatter
top half (if not generic)
bottom half
plus data in CommandProps
These have to go in the proper packages. Rely on the writer to get this right?
- Add generic top half functionality, now that the dispatcher does so much.
Would eliminate top halves for:
alias (with additional parameter for validating additional positional
arguments as addresses).
auxadd
auxremove
auxwho_start
register
rekey
set
show
showtokens
unalias
Add to dispatcher:
exactly which positional arguments to validate as addresses.
generic_top_half item
Write generic_top_half function:
Log in; display some args
Make list if necessary.
Call access check.
Log result.
Call bottom half if necessary.
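The steps above could look roughly like this. Everything here is a hypothetical stand-in: the request hash layout, the toy access table, and the bottom-half table are assumptions for illustration, not the real dispatcher or Access.pm interfaces.

```perl
use strict;
use warnings;

my @log;                                         # stand-in for the real logger
my %access = (show => 'allow', set => 'deny');   # toy access table (assumption)

sub access_check {
    my ($command) = @_;
    return $access{$command} || 'deny';
}

# Bottom halves keyed by command name (assumption).
my %bottom_half = (
    show => sub { my ($req) = @_; return (1, "showing $req->{victim}") },
    set  => sub { my ($req) = @_; return (1, "settings changed for $req->{victim}") },
);

sub generic_top_half {
    my ($req) = @_;
    # Log in; display some args.
    push @log, "start $req->{command} $req->{list} $req->{victim}";
    # A real version would make the list object here if necessary.
    # Call access check and log the result.
    my $verdict = access_check($req->{command});
    push @log, "access $req->{command}: $verdict";
    return (0, "access denied") unless $verdict eq 'allow';
    # Call bottom half if necessary.
    return $bottom_half{ $req->{command} }->($req);
}

my ($ok, $mess) = generic_top_half(
    { command => 'show', list => 'GLOBAL', victim => 'user@example.com' });
print "$ok $mess\n";
```

The point of the sketch is that nothing in generic_top_half is specific to one command, which is what would let the listed top halves go away.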
! Dispatch should log password failures for overriding logging.
- Make 'default' work in interactive shell.
! Investigate other locking methods.
! Make sure that $SENDER is reset properly after _trim_approval is called,
so we pick up the real sender, not the approver.
! Make sure all sent mails have To: headers.
! Continue generalization of database mechanism.
Write generic export and import functions.
Write autoconversion function.
Write backends: BerkeleyDB, MySQL.
! Finish archive interface.
! Make sure that archives get a separate umask, so that people can
download them.
- Add Makefile.PL section to look for features in addition to
prerequisites; if the necessary modules are present, the features will be
enabled.
Features: Time::HiRes
DB_File (or BerkeleyDB?)
whatever the MySQL module is
- Tool to incorporate mbox files into an archive.
Can run from alias (like archive2.pl) or can slurp an entire folder.
Can take a list (in which case it sucks things from the list's config) or
the necessary variables (dir, split, size) directly.
Import from files:
mj_archive -d domain -l list file1 file2
In an alias (read 1 message from stdin):
mj_archive -d domain -l list
Without a list, import from files:
mj_archive -r directory -p split -z size file1 file2
Without a list, import from stdin:
mj_archive -r directory -p split -z size
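One way the four invocation forms could be told apart, sketched with Getopt::Std. The function name archive_mode and the returned hash layout are hypothetical, and requiring -d together with -l (and -r, -p, -z together) is an assumption read off the examples above.

```perl
use strict;
use warnings;
use Getopt::Std;

# Classify an mj_archive command line. Returns a hashref, or nothing on bad usage.
sub archive_mode {
    local @ARGV = @_;                        # getopts works on @ARGV
    my %opt;
    getopts('d:l:r:p:z:', \%opt) or return;
    # Leftover arguments are files; with none, read one message from stdin.
    my %mode = (source => (@ARGV ? 'files' : 'stdin'), files => [@ARGV]);
    if (defined $opt{l}) {
        # With a list, dir/split/size come from the list's config.
        return unless defined $opt{d};
        return +{ %mode, config => 'list', domain => $opt{d}, list => $opt{l} };
    }
    # Without a list, all three variables must be given directly.
    return unless defined $opt{r} && defined $opt{p} && defined $opt{z};
    return +{ %mode, config => 'direct',
              dir => $opt{r}, split => $opt{p}, size => $opt{z} };
}

my $mode = archive_mode(qw(-d domain -l list file1 file2));
print "$mode->{config} $mode->{source} ", scalar @{ $mode->{files} }, "\n";
```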
! Make mail_message go through the same delivery functions as normal
delivery.
! Add a banned_domains list; just ignore connections from these hosts.
- Implement rejection (and perhaps acceptance) messages for tokens, and add
a box to type them in to the web interface.
? Move spool files out of filespace like sessions? Use filespace only for
things needing remote access (not spooled files) or things needing i18n
(not spooled files) or things you want an index of (perhaps spooled
files). Maybe an open token report is warranted.
! Code to force 2047-encoding of headers. Function should take an entity
and a charset and call MIME::Words::encode_mimewords on each header.
What will e_mw do with embedded newlines? If the charset is not provided
(undef), the routine should try to figure one out from the charset of the
body, or if one can't be figured out, it should use ISO-8859-1.
! Implement separate user-readable and owner-readable bounce reasons.
! Format: finish rewriting the following
filesync
configshow
configdef
- allow mime-match classes just like taboo classes.
- rewrite lots of routines to take named arguments instead of positionals.
- Allow end user to set language preference. Add language choice as extra
parser variable. Export a core set_language routine. Reference a
Majordomo class variable inside of internationalization routines.
- Resend:
Front-end filtering stuff -
header improprieties (not in To or CC header, etc.)
Back end mime-modification functions come later.
- Add some variable expansions in message_fronters/footers like in
message_headers.
! Write up truncation logic to keep envelope sizes down for bounce probes.
- inform and consult uses moderators variable, listing named groups of
moderators:
namea
addra1
addra2
nameb
addrb1
addrb2
- list owner information (on hold until I collect more logs to process)
A 'report' command used to give several list metrics:
summary of activity (from inform_owner routine)
message statistics generated from the archive index (from Dave's
stats code)
summary of existing tokens
Two reports:
list report: summary of activity for one list. Includes detailed
breakdowns on who did what to whom. Categorize all subscriptions,
all unsubscriptions, and indeed all activity broken down per
request type. Summarize activity at beginning of report. Include
posting activity at a later date, since this is logged separately.
(Should this be logged separately?)
Shows all open, non-permanent tokens for the list.
GLOBAL report: report on all things global. Give a breakdown on all
GLOBAL commands and tokens. No postings here. Also give summary
of all requests broken down by list.
Command takes zero, one or two dates. No dates makes a report since the
last no-date report. One date makes a report from that date forward.
Two dates makes a report between the two dates.
Modes can be used to specify only tokens, only actions or only traffic
(or notokens, notraffic, noactions) and only failed, only stalls, only
success, only passworded, etc.
- subscriber flag "novice/expert" passed in as a parameter to "post" access
control.
- Rewrite shell.t to just build the file, call mj_shell once (or perhaps
just a few times) and check the output. This should improve test times
quite a bit.
- RBL/ORBS/DUL queries:
parse all Received: headers, look for IP addresses.
Look them up in the RBL.
Add an exemption list (addrs? IPs?)
Set access variable; bounce to owner by default if set.
Don't ever do any checks if some config variable not set.
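A rough sketch of the header-parsing half of this, with the actual DNS lookup left out. The function names and the zone dnsbl.example.org are placeholders; only the reversed-octet query name convention is standard DNSBL practice. Only IPv4 dotted quads are handled.

```perl
use strict;
use warnings;

# Pull dotted-quad IPv4 addresses out of Received: header lines.
sub received_ips {
    my @ips;
    for my $header (@_) {
        while ($header =~ /\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b/g) {
            my @octets = ($1, $2, $3, $4);
            next if grep { $_ > 255 } @octets;   # not a real address
            push @ips, join '.', @octets;
        }
    }
    return @ips;
}

# DNSBL query name: octets reversed, then the zone appended.
sub dnsbl_name {
    my ($ip, $zone) = @_;
    return join('.', reverse(split /\./, $ip)) . ".$zone";
}

# Names to look up; empty when no zone is configured, so no checks happen.
sub names_to_check {
    my ($zone, $exempt, @headers) = @_;
    return () unless defined $zone && length $zone;
    return map  { dnsbl_name($_, $zone) }
           grep { !$exempt->{$_} } received_ips(@headers);
}

my @names = names_to_check(
    'dnsbl.example.org',
    { '127.0.0.1' => 1 },                        # exemption list by IP
    'from mail.example.com (mail.example.com [192.0.2.1]) by mx.example.org',
    'from localhost (localhost [127.0.0.1]) by mail.example.com',
);
print "$_\n" for @names;
```

If any of the returned names resolves, the access variable would be set; the resolver call itself is deliberately not shown here.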
Well thought out stuff:
----------------------
- Internationalization. File search list is already done. Have the core
provide internationalization functions. Place, in the normal filespace,
a file containing translations for messages too short to warrant their
own files. Use the current file search list to locate this file, then
read it into a per-list data structure. The file's format should look
something like this:
# Comments
language
message_tag
Translation $A of $C tag $B!
another_message_tag
Translation Line 1
$A $B $B line 2
[...]
The data structure looks like this:
$self->{trans}{$lang} =
{
message_tag => 'Translation string',
[...]
};
Variables to be replaced in the string are of the form $A, $B, etc.
Translation involves looking up the message tag, making the substitution
with the given variables and returning the string.
If the necessary translation tag does not exist, what to do? Back off to
the global entry under the same language, then to English, and then to
some default string? Or try to "fill in the blanks" (parse another file,
pulling in only those language tags not in the data structure) from the
next thing in the search list?
Right now, back off to global makes the most sense.
The data structure can also hold a timestamp; the search path is
consulted to find the first matching file which is then statted and
compared; if it has changed (at all, perhaps only due to a changed search
list finding an older file), it is reparsed.
The translation data structures are stored (via Data::Dumper) in
_trans in the list's root.
The default English translations can be placed in the code and used on
the client side.
From a message to mj-translators:
What remains is the language preference stuff (the file retrieval mechanism
can deal with a preferred language but nothing ever passes it one) and a
method for dealing with the short responses.
Short responses are the one or two line messages scattered throughout the
code. The way to handle it will be to have an internal three-level hash
keyed on language, list and 'message tag' which will be loaded
(incrementally, as languages are needed) from message files. Those files
will be retrieved like any response file, which allows global defaults and
per-list customization even by remote list owners.
The messages in the code will need to be identified and replaced with
'message tags' concatenated with data; the interface code can take the tags
and data, look things up in the message hash, expand embedded variables in
the returned strings and present the result (prettied and wrapped,
probably) to the user.
In other words, you take something like this in the code:
return (0, "User $user is a bozo because $list is not a valid list!\n");
and turn it into:
return (0, "user_is_bozo_bad_list\t$user\t$list");
The interface (really the code in Format.pm, which runs on the interface
side) looks up user_is_bozo_bad_list and gets:
User $A is a bozo because $B is not a valid list!\n
or
Yuusano $A ga bakadakara $B ga iyano risuto desu!\n
$A and $B are replaced by the values attached to the tag, resulting in the
original string (or a really bad but mildly polite Japanese translation of
it).
At least, that's the plan.
Ramp up to translation slowly:
'translate' core call.
One multilevel hash, static, in a require'd file:
%trans = (
'english' =>
{
'tag' => "Subst string",
}
);
Expand to hardcoded translations of other languages, fall back to English,
fall back to just printing the returned string (so if no messages are
changed, you always get English).
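A minimal sketch of such a 'translate' call, assuming the tab-separated tag/data convention from the mj-translators message and a static table. The function signature is an assumption; the two table entries are the example strings from above.

```perl
use strict;
use warnings;

# Static table; in the plan this would live in a require'd file.
my %trans = (
    english => {
        user_is_bozo_bad_list => 'User $A is a bozo because $B is not a valid list!',
    },
    japanese => {
        user_is_bozo_bad_list => 'Yuusano $A ga bakadakara $B ga iyano risuto desu!',
    },
);

sub translate {
    my ($lang, $message) = @_;
    my ($tag, @args) = split /\t/, $message;

    # Requested language first, then English.
    my $template;
    $template = $trans{$lang}{$tag} if defined $lang && exists $trans{$lang};
    $template = $trans{english}{$tag} unless defined $template;

    # Not a known tag: fall back to printing the returned string unchanged.
    return $message unless defined $template;

    # $A, $B, ... map to the data fields in order; unknown ones are left alone.
    my %vars;
    @vars{ 'A' .. 'Z' } = @args;
    $template =~ s/\$([A-Z])\b/defined $vars{$1} ? $vars{$1} : "\$$1"/ge;
    return $template;
}

print translate('japanese', "user_is_bozo_bad_list\tjoe\tfoo-list"), "\n";
print translate('english',  "user_is_bozo_bad_list\tjoe\tfoo-list"), "\n";
```

The substitution is done in a single pass, so data that happens to contain "$B" is not expanded a second time.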
Interface needs to be able to tell the core who the user is or otherwise
figure out what the user's language preference is _if_ one is not passed to
the interface somehow (prefer-language: header, 'default language' command,
etc.) This gets pulled out of the registration database, which allows us
to not need a list object to get a language preference, which greatly
simplifies things.
Then the 'translate' call just takes a language parameter and does the proper
lookup.
_Then_ extend things to weird per-list loading _only if_ someone decides
they want it.
We already support translated long documents.
- Daemon model. Startup time is always going to plague us. The solution
is to eliminate it. Have daemons run in the background as necessary to
serve requests.
Ignore the complex stuff that was here; use one of the RPC modules instead.
- Figure out how to more closely link bounce reasons with access control
actions. Problem: we can only generate bounce reasons before we call
access_check. We can't generate all possible reasons and we might not
want to generate any. The owner can't choose which reasons to use, and
can't add any.
We can add a clear_reasons action and a reason action, to erase the reasons
list and add additional bounce reasons. This makes the access_control
stuff look uglier than it needs to be, although continuation lines can
help.
- uniquify function names
Make function names unique to 8 characters, to appease AutoSplit.
- Verify that non-wrapper configuration is correct
create a dummy script, make it setuid, and run it. If it errors, we
know it's bad.
- store some data on each user-request so we can limit abusers. Basically
we want to store enough data to see if an address is sending too many
requests to us. A circular list of a fixed size would probably be good;
we save it between sessions and if an address shows up too many times we
set an access flag before running the access check.
One big global queue?
One queue per request?
One queue per list?
One queue per list per request?
Something else? Some kind of time average per address per request?
How do we allow things like someone who works offline and dumps a pile of
email every couple of days?
How do we back off when a user gets close to the limit? We don't want to
just start ignoring requests. We could sleep, then sleep some more,
then bounce, then reject, then ignore progressively as the traffic
increases.
How to make sure that a slow server that has only a few users doesn't
fill up a queue with only a few addresses causing them to bounce? Store
address and a timestamp and ignore/purge old requests?
How to store this data? Probably not in the config file.
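A sketch of the "one big global queue" variant with timestamps, to make the idea concrete. The sizes and limits are made-up numbers, not config variables, and saving the list between sessions is left out. It answers none of the open questions above; it only shows the fixed-size circular list with old entries ignored.

```perl
use strict;
use warnings;

# Hypothetical tunables.
my $SIZE   = 100;     # slots in the circular list
my $WINDOW = 3600;    # entries older than this many seconds are ignored
my $LIMIT  = 10;      # requests per address allowed inside the window

my @ring;             # each slot holds [address, timestamp]
my $next = 0;         # slot to overwrite next

# Record a request; returns true if the access flag should be set.
sub note_request {
    my ($addr, $now) = @_;
    my $key = lc $addr;
    $ring[$next] = [$key, $now];
    $next = ($next + 1) % $SIZE;              # wrap around, overwriting the oldest
    my $count = grep {
        $_ && $_->[0] eq $key && $now - $_->[1] <= $WINDOW
    } @ring;
    return $count > $LIMIT ? 1 : 0;
}

my $flag = 0;
$flag = note_request('user@example.com', 1000 + $_) for 1 .. 11;
print "flag after 11 requests: $flag\n";
```

Because old entries are ignored by timestamp, a slow server with only a few users does not trip the limit just by filling the list with the same few addresses.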
Categorized ideas:
-----------------
Config Stuff
- passwords << asdf : !config (exclude a variable)
Email Interface
- prescan for bad situations: unmatched tokens, too many subscribes in one
message, etc.
- prescan for access restriction override.
- context-sensitive help: Format routines return list of help topics
with error; email interface looks up topics and mails back appropriate
help.
- make sure that no internal routine gets called with an undefined list:
"subscribe" sent with no deflist set.
Format
- info ALL for hyper-long lists output
- auxshow list ALL
Confirmation/Approval engine
- Ability to append or prepend information to approved messages
- list tokens, expire old tokens.
Resend
- MIME transformations
Mj::Log
- Log classes; too much debug info now
- Have aborts and warnings stuffed into some tempfile and process that file
after the fact. This allows aborts to propagate even if we've hosed the
mail delivery system.
Meta
- Add proper $VERSION tags for all modules.
- Figure out the proper names for functions and clean them all up.
- Figure out how to make all messages configurable.
- Make sure that addresses are validated/stripped as little as possible.
If a routine can return an error string and needs to strip the address,
it can save upstream functions the trouble.
- Try to avoid aborting when there is another way. Unfortunately, aborting
happens at the lowest levels of the code (that would have to be working
anyway in order to send abort messages).
- Make stub modules that inherit from the real modules so people have a
place to put their extension/overriding code without copying things
around.
GTK client
telnet client
Finish all commands
Figure out client-server spec.
Uncategorized random thinkos:
----------------------------
retrieve and replace individual subscriber field data, for hard-core admin
stuff.
Subscription policy patch -> sponsored subscribes.
archive search engine?
Integrate Dave's logmail stuff -> stats command.
Support for multipart/alternative when sending welcome messages.
Periodically sort lists for proper batching.
Spellcheck the documentation and messages.
PGP verification for email. How to trust a client?
Internally equate meta-lists * and ALL (valid_list)
Limit message delivery/email request processing to certain times/load
averages
Some way to store some data until the next time a user communicates with
us. So if, say, they are unsubscribed due to bounces but later try to
send a message or check their subscription status they will be informed.
Obviously we can't inform them when we remove them, since that message would
bounce, too.
Objectify the parser, then register legal methods with it?
Bounce handling: hack on BounceFilter and AutoBounce.
Novice_level variable that affects getting documentation for config
variables? Novice set of config variables? Novice config documentation?
Expiration of subscriptions.
Last posting date.
Have who expand lists subscribed to current list.
Access_rules methods for checking against requesting user instead of victim.
Forward to other (or same) server, but change the list name.
-c option to mj_email (or new mj_command script) to always execute a
command when a mail is received.
Have 'help command xxx' look up config comment string for xxx.
Support X-No-Archive. _post should look for headers and pseudo-headers,
but not remove pseudos if found.
Createlist should mail information to the list owner. This should be
some distilled version of list-owner-info, containing pointers to
additional documents that can be retrieved with the help command.
Of course, all of these documents need to be written.
Add Majordomo version info to the X-Mailer header that MIME-Tools adds.
Make a separate repository for the backup default files so that we can
upload them without worrying about overwriting local modifications.
Handle stupid crap like "subscribe to list" and "subscribe list digest".
Some kind of pseudo-anonymizer so that users' addresses are replaced by
some anonymous token; when the message hits the archives, the headers are
replaced by this. This prevents harvesting of addresses from the
archives. Majordomo can handle the mapping between tokens and addresses.
Allow a 'default prefix' command to specify a prefix to strip from all
lines in the text parser. This allows replies to be yanked and quoted, as
long as they aren't line wrapped.
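The stripping itself is small; a sketch, assuming the prefix is a literal string rather than a pattern. The function name strip_prefix is hypothetical.

```perl
use strict;
use warnings;

# Remove a literal prefix from the start of each line, if present.
sub strip_prefix {
    my ($prefix, @lines) = @_;
    return @lines unless defined $prefix && length $prefix;
    return map { my $line = $_; $line =~ s/^\Q$prefix\E//; $line } @lines;
}

my @commands = strip_prefix('> ', '> subscribe test-list', '> end', 'unquoted line');
print "$_\n" for @commands;
```

A wrapped line loses its prefix on the continuation, which is why this only works while replies are not line wrapped.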
Add additional sender information to things like confirmations so that if
they bounce, the bounce processor can take care of the generated token and
not forward the bounce to the owner.
Allow access control language to make modifications to outgoing messages,
perhaps by passing a list of actions back to the access check routine
before the message is spooled.
Keep a list of banned domains, compare against them in resend and fail all
commands from them in the email interface (or, perhaps, in the
dispatcher).
Purge archive files after a certain period of time.
Make archiver add files into filespace with useful descriptions.
Allow headers to vary depending on access variables.
Have an action attach text to an outgoing message (warning about quotes,
deleted parts, etc.)
Always allocating address objects for every core call costs for iterators;
invent some handle method instead so we can cache state between calls.
If processing a command yields an invalid address error and following line
does not hold a legal command, join the two lines and try again.
Export information from the registration database to .htaccess files, so
owners can restrict access to web archives or whatever.
digest-headers variable?
Keep a count of addresses delivered to and return this out of the delivery
module.