# bundle.repy (forked from SeattleTestbed/seattlelib_v1)
'''
<Module name>
bundle.repy
<Purpose>
Bundles simplify the transferring of repy programs and associated data to and
from vessels. A bundle is a self-extracting repy program that contains a
repy program and embedded files that the contained program depends on.
Bundles have a .bundle.repy extension.
Embedded files within a bundle are extracted into the local file system
before the flow of execution reaches the contained program. Bundles do not
necessarily have to contain a repy program, and can be used solely to pack
data into a single unit.
This module provides a Python file-like interface for manipulating these
bundles. You may perform the following actions on bundles:
- Create a new bundle
- Add files to/Remove files from a bundle
- Extract files from a bundle
- Show a bundle's contents
- Wipe a bundle's contents
For more usage information, see the Example Usage section below, or consult
the online wiki page on the bundle API:
https://seattle.cs.washington.edu/wiki/SeattleLib/bundle.repy
This is the basic file structure of a repy bundle:
----------
Auto-extracting Code for Bundles
...
Bundled program content, if specified on bundle creation
...
"""
Data for first file
Data for second file
Data for third file
...
Metadata
Metadata length
"""
----------
The contents of the bundle are stored in the metadata section of the file.
Metadata that is stored is the following:
File location (relative to the beginning of the file)
File length (in chars)
The metadata length is used to locate the metadata. The repy file object
does not provide information on how large the file is. In order to simplify
the loading of the file metadata, a set amount of space is reserved for this
metadata length value. The location of the metadata can then be used to
locate the metadata relative to the pointer's location when reading the
metadata length.
<Example usage>
This program is meant to be used as a module. To use it directly from the
command line, see bundler.py
# Creating a bundle
mybundle = bundle_Bundle('my.bundle', 'w')
mybundle.add('log1')
mybundle.add('log2')
mybundle.close()
# Modifying the bundle
mybundle = bundle_Bundle('my.bundle', 'a')
mybundle.remove('log2')
mybundle.add('replacement')
mybundle.close()
# Reading data from the bundle
mybundle = bundle_Bundle('my.bundle', 'r')
mybundle.list()
log1contents = mybundle.extract_to_string('log1')
mybundle.extract_all()
# You can now read the extracted files directly
replacementcontents = open('replacement').read()
...
'''
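The fixed-width metadata-length trailer described above can be sketched in plain Python. This is an illustrative model of the layout, not bundle.repy's actual code; `build_trailer` and `read_trailer` are hypothetical names.

```python
# Sketch of the trailer layout: ...data... | metadata | 10-char length field.
# A reader that cannot query the file size can still seek a fixed distance
# back from the end, read the 10-char field, and use it to locate the metadata.
METADATA_WIDTH = 10

def build_trailer(metadata):
    # Append the metadata, then its length in a fixed 10-character field.
    return metadata + ("%10i" % len(metadata))

def read_trailer(blob):
    # The last 10 characters always hold the metadata length...
    meta_len = int(blob[-METADATA_WIDTH:])
    # ...which says how far back the metadata itself starts.
    return blob[-METADATA_WIDTH - meta_len:-METADATA_WIDTH]

blob = "program text..." + build_trailer("{'files': {}}")
print(read_trailer(blob))  # → {'files': {}}
```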
#begin include base64.repy
"""
<Program Name>
$Id: base64.repy 2527 2009-07-26 22:48:38Z cemeyer $
<Started>
April 12, 2009
<Author>
Michael Phan-Ba
<Purpose>
Provides data encoding and decoding as specified in RFC 3548. This
module implements a subset of the Python module base64 interface.
b32encode(), b32decode(), b16encode(), b16decode(), decode(),
decodestring(), encode(), and encodestring() are not currently
implemented.
<Changes>
2009-04-12 Michael Phan-Ba <mdphanba@gmail.com>
* Initial release
2009-05-23 Michael Phan-Ba <mdphanba@gmail.com>
* (b64encode, b64decode, standard_b64encode, standard_b64decode,
urlsafe_encode, urlsafe_decode): Renamed functions with base64 prefix
2009-05-24 Michael Phan-Ba <mdphanba@gmail.com>
* Set property svn:keyword to "Id"
"""
# The Base64 for use in encoding
BASE64_ALPHABET = \
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
def base64_b64encode(s, altchars=None):
"""
<Purpose>
Encode a string using Base64.
<Arguments>
s:
The string to encode.
altchars:
An optional string of at least length 2 (additional characters are
ignored) which specifies an alternative alphabet for the + and /
characters. The default is None, for which the standard Base64
alphabet is used.
<Exceptions>
None.
<Side Effects>
None.
<Returns>
The encoded string.
"""
# Build the local alphabet.
if altchars is None:
base64_alphabet = BASE64_ALPHABET
else:
base64_alphabet = BASE64_ALPHABET[:62] + altchars
# Change from characters to integers for binary operations.
bytes = []
for x in s:
bytes.append(ord(x))
# Encode the 8-bit words into 6-bit words.
x6bit_words = []
index = 0
while True:
# Encode the first 6 bits from three 8-bit values.
try:
x8bits = bytes[index]
except IndexError:
break
else:
x6bits = x8bits >> 2
leftover_bits = x8bits & 3
x6bit_words.append(base64_alphabet[x6bits])
# Encode the next 8 bits.
try:
x8bits = bytes[index + 1]
except IndexError:
x6bits = leftover_bits << 4
x6bit_words.extend([base64_alphabet[x6bits], "=="])
break
else:
x6bits = (leftover_bits << 4) | (x8bits >> 4)
leftover_bits = x8bits & 15
x6bit_words.append(base64_alphabet[x6bits])
# Encode the final 8 bits.
try:
x8bits = bytes[index + 2]
except IndexError:
x6bits = leftover_bits << 2
x6bit_words.extend([base64_alphabet[x6bits], "="])
break
else:
x6bits = (leftover_bits << 2) | (x8bits >> 6)
x6bit_words.append(base64_alphabet[x6bits])
x6bits = x8bits & 63
x6bit_words.append(base64_alphabet[x6bits])
index += 3
return "".join(x6bit_words)
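As a sanity check, the hand-rolled encoder above should agree with Python's stdlib `base64` module when the standard alphabet is used:

```python
import base64

# Standard-alphabet encodings for short strings, covering both padding cases.
assert base64.b64encode(b"hi").decode() == "aGk="        # two bytes -> '=='? no: '='
assert base64.b64encode(b"hiya").decode() == "aGl5YQ=="  # four bytes -> '=='
```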
def base64_b64decode(s, altchars=None):
"""
<Purpose>
Decode a Base64 encoded string. The decoder ignores all characters
not in the Base64 alphabet, for compatibility with the
Python library. However, this introduces a security loophole in
which covert or malicious data may be passed.
<Arguments>
s:
The string to decode.
altchars:
An optional string of at least length 2 (additional characters are
ignored) which specifies an alternative alphabet for the + and /
characters. The default is None, for which the standard Base64
alphabet is used.
<Exceptions>
TypeError on decoding error.
<Side Effects>
None.
<Returns>
The decoded string.
"""
# Build the local alphabet.
if altchars is None:
base64_alphabet = BASE64_ALPHABET
else:
base64_alphabet = BASE64_ALPHABET[:62] + altchars
# Generate the translation maps for decoding a Base64 string.
translate_chars = []
for x in xrange(256):
char = chr(x)
translate_chars.append(char)
# Build the strings of characters to delete.
delete_chars = []
for x in translate_chars:
if x not in base64_alphabet:
delete_chars.append(x)
delete_chars = "".join(delete_chars)
# Insert the 6-bit Base64 values into the translation string.
k = 0
for v in base64_alphabet:
translate_chars[ord(v)] = chr(k)
k += 1
translate_chars = "".join(translate_chars)
# Count the number of padding characters at the end of the string.
num_pad = 0
i = len(s) - 1
while i >= 0:
if s[i] == "=":
num_pad += 1
else:
break
i -= 1
# Translate the string into 6-bit characters and delete extraneous
# characters.
s = s.translate(translate_chars, delete_chars)
# Determine correct alignment by calculating the number of padding
# characters needed for compliance to the specification.
align = (4 - (len(s) & 3)) & 3
if align == 3:
raise TypeError("Incorrectly encoded base64 data (has 6 bits of trailing garbage)")
if align > num_pad:
# Technically, this isn't correctly padded. But it's recoverable, so let's
# not care.
pass
# Change from characters to integers for binary operations.
x6bit_words = []
for x in s:
x6bit_words.append(ord(x))
for x in xrange(align):
x6bit_words.append(-1)
# Decode the 6-bit words into 8-bit words.
bytes = []
index = 0
while True:
# Work on four 6-bit quantities at a time. End when no more data is
# available.
try:
(x6bits1, x6bits2, x6bits3, x6bits4) = x6bit_words[index:index + 4]
except ValueError:
break
# Save an 8-bit quantity.
bytes.append((x6bits1 << 2) | (x6bits2 >> 4))
# End of valid data.
if x6bits3 < 0:
break
# Save an 8-bit quantity.
bytes.append(((x6bits2 & 15) << 4) | (x6bits3 >> 2))
# End of valid data.
if x6bits4 < 0:
break
# Save an 8-bit quantity.
bytes.append(((x6bits3 & 3) << 6) | x6bits4)
# Next four 6-bit quantities.
index += 4
return "".join([chr(x) for x in bytes])
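The "loophole" noted in the docstring above also exists in Python's stdlib decoder: with the default `validate=False`, characters outside the alphabet are silently discarded before decoding, so extra data can ride along unnoticed.

```python
import base64

# Non-alphabet characters are dropped before decoding, not rejected.
assert base64.b64decode("aG\nk=") == b"hi"    # embedded newline ignored
assert base64.b64decode("a!G!k!=") == b"hi"   # arbitrary junk ignored
```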
def base64_standard_b64encode(s):
"""
<Purpose>
Encode a string using the standard Base64 alphabet.
<Arguments>
s:
The string to encode.
<Exceptions>
None.
<Side Effects>
None.
<Returns>
The encoded string.
"""
return base64_b64encode(s)
def base64_standard_b64decode(s):
"""
<Purpose>
Decode a Base64 encoded string using the standard Base64 alphabet.
<Arguments>
s:
The string to decode.
<Exceptions>
TypeError on decoding error.
<Side Effects>
None.
<Returns>
The decoded string.
"""
return base64_b64decode(s)
def base64_urlsafe_b64encode(s):
"""
<Purpose>
Encode a string using a URL-safe alphabet, which substitutes -
instead of + and _ instead of / in the standard Base64 alphabet.
<Arguments>
s:
The string to encode.
<Exceptions>
None.
<Side Effects>
None.
<Returns>
The encoded string.
"""
return base64_b64encode(s, "-_")
def base64_urlsafe_b64decode(s):
"""
<Purpose>
Decode a Base64 encoded string using a URL-safe alphabet, which
substitutes - instead of + and _ instead of / in the standard Base64
alphabet.
<Arguments>
s:
The string to decode.
<Exceptions>
TypeError on decoding error.
<Side Effects>
None.
<Returns>
The decoded string.
"""
return base64_b64decode(s, "-_")
#end include base64.repy
#begin include serialize.repy
"""
Author: Justin Cappos
Start date: October 9th, 2009
Purpose: A simple library that serializes and deserializes built-in repy types.
This includes strings, integers, floats, booleans, None, complex, tuples,
lists, sets, frozensets, and dictionaries.
There are no plans for including objects.
Note that all items are treated as separate references. This means things
like 'a = []; a.append(a)' will result in an infinite loop. If you have
'b = []; c = (b,b)' then 'c[0] is c[1]' is True. After deserialization
'c[0] is c[1]' is False.
I can add support or detection of this if desired.
"""
# The basic idea is simple. Say the type (a character) followed by the
# type specific data. This is adequate for simple types
# that do not contain other types. Types that contain other types, have
# a length indicator and then the underlying items listed sequentially.
# For a dict, this is key1value1key2value2.
def serialize_serializedata(data):
"""
<Purpose>
Convert a data item of any type into a string such that we can
deserialize it later.
<Arguments>
data: the thing to serialize. Can be of essentially any type except
objects.
<Exceptions>
TypeError if the type of 'data' isn't allowed
<Side Effects>
None.
<Returns>
A string suitable for deserialization.
"""
# this is essentially one huge case statement...
# None
if type(data) == type(None):
return 'N'
# Boolean
elif type(data) == type(True):
if data == True:
return 'BT'
else:
return 'BF'
# Integer / Long
elif type(data) is int or type(data) is long:
datastr = str(data)
return 'I'+datastr
# Float
elif type(data) is float:
datastr = str(data)
return 'F'+datastr
# Complex
elif type(data) is complex:
datastr = str(data)
if datastr[0] == '(' and datastr[-1] == ')':
datastr = datastr[1:-1]
return 'C'+datastr
# String
elif type(data) is str:
return 'S'+data
# List or tuple or set or frozenset
elif type(data) is list or type(data) is tuple or type(data) is set or type(data) is frozenset:
# the only impact is the first letter...
if type(data) is list:
mystr = 'L'
elif type(data) is tuple:
mystr = 'T'
elif type(data) is set:
mystr = 's'
elif type(data) is frozenset:
mystr = 'f'
else:
raise Exception("InternalError: not a known type after checking")
for item in data:
thisitem = serialize_serializedata(item)
# Append the length of the item, plus ':', plus the item. 1 -> '2:I1'
mystr = mystr + str(len(thisitem))+":"+thisitem
mystr = mystr + '0:'
return mystr
# dict
elif type(data) is dict:
mystr = 'D'
keysstr = serialize_serializedata(data.keys())
# Append the length of the list, plus ':', plus the list.
mystr = mystr + str(len(keysstr))+":"+keysstr
# just plop the values on the end.
valuestr = serialize_serializedata(data.values())
mystr = mystr + valuestr
return mystr
# Unknown!!!
else:
raise TypeError("Unknown type '"+str(type(data))+"' for data :"+str(data))
def serialize_deserializedata(datastr):
"""
<Purpose>
Convert a serialized data string back into its original types.
<Arguments>
datastr: the string to deserialize.
<Exceptions>
ValueError if the string is corrupted
TypeError if the type of 'data' isn't allowed
<Side Effects>
None.
<Returns>
Items of the original type
"""
if type(datastr) != str:
raise TypeError("Cannot deserialize non-string of type '"+str(type(datastr))+"'")
typeindicator = datastr[0]
restofstring = datastr[1:]
# this is essentially one huge case statement...
# None
if typeindicator == 'N':
if restofstring != '':
raise ValueError("Malformed None string '"+restofstring+"'")
return None
# Boolean
elif typeindicator == 'B':
if restofstring == 'T':
return True
elif restofstring == 'F':
return False
raise ValueError("Malformed Boolean string '"+restofstring+"'")
# Integer / Long
elif typeindicator == 'I':
try:
return int(restofstring)
except ValueError:
raise ValueError("Malformed Integer string '"+restofstring+"'")
# Float
elif typeindicator == 'F':
try:
return float(restofstring)
except ValueError:
raise ValueError("Malformed Float string '"+restofstring+"'")
# Complex
elif typeindicator == 'C':
try:
return complex(restofstring)
except ValueError:
raise ValueError("Malformed Complex string '"+restofstring+"'")
# String
elif typeindicator == 'S':
return restofstring
# List / Tuple / set / frozenset / dict
elif typeindicator == 'L' or typeindicator == 'T' or typeindicator == 's' or typeindicator == 'f':
# We'll split this and keep adding items to the list. At the end, we'll
# convert it to the right type
thislist = []
data = restofstring
# We'll use '0:' as our 'end separator'
while data != '0:':
lengthstr, restofdata = data.split(':', 1)
length = int(lengthstr)
# get this item, convert to a string, append to the list.
thisitemdata = restofdata[:length]
thisitem = serialize_deserializedata(thisitemdata)
thislist.append(thisitem)
# Now toss away the part we parsed.
data = restofdata[length:]
if typeindicator == 'L':
return thislist
elif typeindicator == 'T':
return tuple(thislist)
elif typeindicator == 's':
return set(thislist)
elif typeindicator == 'f':
return frozenset(thislist)
else:
raise Exception("InternalError: not a known type after checking")
elif typeindicator == 'D':
lengthstr, restofdata = restofstring.split(':', 1)
length = int(lengthstr)
# get this item, convert to a string, append to the list.
keysdata = restofdata[:length]
keys = serialize_deserializedata(keysdata)
# The rest should be the values list.
values = serialize_deserializedata(restofdata[length:])
if type(keys) != list or type(values) != list or len(keys) != len(values):
raise ValueError("Malformed Dict string '"+restofstring+"'")
thisdict = {}
for position in xrange(len(keys)):
thisdict[keys[position]] = values[position]
return thisdict
# Unknown!!!
else:
raise ValueError("Unknown typeindicator '"+str(typeindicator)+"' for data :"+str(restofstring))
#end include serialize.repy
class bundle_InvalidOperationError(Exception):
''' Describes an invalid operation on a bundle. '''
class bundle_EncodingError(bundle_InvalidOperationError):
''' An error occurred during the encoding/decoding process. '''
# We use this to mark where our autoextraction script ends
_BUNDLE_AUTOEXTRACT_END_DELIMITER = "# End of auto-extraction script\n"
# We use these to mark where our section begins/ends
_BUNDLE_DATA_BEGIN_DELIMITER = "\n# Bundled data\n'''"
_BUNDLE_DATA_END_DELIMITER = "'''\n# End of bundled data\n"
# This is to allow bundles to auto-extract themselves.
# The include statement is 'inlined' with the previous line because
# we want to avoid the preprocessor from processing that line prior to
# the user running the preprocessor to compile their program.
_BUNDLE_AUTOEXTRACT_HEADER = """
# Start of auto-extraction script
bundle = bundle_Bundle('%s', 'r')
bundle.extract_all()
# Don't let user accidentally access this later
del bundle
""" + _BUNDLE_AUTOEXTRACT_END_DELIMITER
# This specifies the fixed-width format used to read and write the metadata length.
# Ten decimal digits give us enough space for metadata lengths up to 10^10 characters.
# This should be enough for most usages...
_BUNDLE_METADATA_WIDTH_LEN = 10
_BUNDLE_METADATA_WIDTH_FORMAT = "%" + str(_BUNDLE_METADATA_WIDTH_LEN) + "i"
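The `"%10i"` format pads the length into a fixed 10-character field, so a reader can always consume exactly 10 characters and `int()` them back:

```python
# Mirrors _BUNDLE_METADATA_WIDTH_FORMAT: right-justified in a 10-char field.
field = "%10i" % 437
assert len(field) == 10     # always exactly 10 characters wide
assert int(field) == 437    # int() tolerates the leading padding spaces
print(repr(field))  # → '       437'
```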
class bundle_Bundle:
def __init__(self, fn, mode, srcfn = None):
"""
<Purpose>
Creates a bundle object for a bundle that exists in, or will be
created in, the current directory.
<Arguments>
fn: The name of the bundle.
mode: The read mode to open the bundle with.
r - Read-only
w - Write (Creates a new bundle)
a - Append (Use this to modify)
srcfn: If specified, the contents of this file will be embedded into
the bundle. This is only valid when creating a bundle with the
'w' flag.
<Side Effects>
Creates a new bundle, or opens an existing bundle for reading or
modification.
If the existing file is not a bundle, opening it in write or append mode
will convert it into a bundle.
<Exceptions>
The common exceptions associated with opening files in repy.
<Return>
The bundle object associated with the provided fn and mode.
"""
if mode not in ('w', 'r', 'a'):
raise ValueError("Invalid or unsupported bundle mode ('"+mode+"')")
if srcfn is None:
srcfn = fn
if mode == 'w':
# Copy contents over from sourcefile if needed
if srcfn != fn:
_bundle_copy_file(srcfn, fn)
self._name = fn
self._mode = mode
self._open(mode)
# Is this an existing bundle?
try:
self._fobj.seek(-len(_BUNDLE_DATA_END_DELIMITER), 2)
read_str = self._fobj.read(len(_BUNDLE_DATA_END_DELIMITER))
# We get an error if we try to seek to the left, when the file length is
# less than the amount we try to seek by.
except IOError, e:
if not "Invalid argument" in str(e):
raise
read_str = ""
existing_bundle = read_str == _BUNDLE_DATA_END_DELIMITER
if mode == 'w':
tempfilefn = 'tempdumpfile'
tempfile = open(tempfilefn, 'wb+')
if not existing_bundle:
# Make a copy of existing file contents
self._fobj.seek(0, 0)
_bundle_copy_file_contents(self._fobj, tempfile)
else:
# Find where the user's script is
data_position = _bundle_find_next_string_occurrence_in_file(self._fobj, _BUNDLE_AUTOEXTRACT_END_DELIMITER) + len(_BUNDLE_AUTOEXTRACT_END_DELIMITER)
self._fobj.seek(data_position, 0)
# Find out how long the user's script is
bundle_data_position = _bundle_find_next_string_occurrence_in_file(self._fobj, _BUNDLE_DATA_BEGIN_DELIMITER)
data_length = bundle_data_position - data_position
# Put the user's script into the temp file
_bundle_copy_file_contents(self._fobj, tempfile, data_length)
# Embed this script in the output script
self._fobj.seek(0, 0)
bundle_class_file = open('bundle.repy', 'rb')
# We must remove all occurrences of \r so that the string search mechanism
# doesn't have to deal with OS newline differences.
# This file really shouldn't have to use \r either way.
chunksize = 4096
data = bundle_class_file.read(chunksize).replace('\r', '')
while data:
self._fobj.write(data)
data = bundle_class_file.read(chunksize).replace('\r', '')
# Attach autoextraction script
self._fobj.write(_BUNDLE_AUTOEXTRACT_HEADER % self._name)
# Insert the user's script into the bundle
tempfile.seek(0, 0)
_bundle_copy_file_contents(tempfile, self._fobj)
# We are done with the temp file
tempfile.close()
removefile(tempfilefn)
# Append our data at the end of the file
self._fobj.seek(0, 2)
self._fobj.write(_BUNDLE_DATA_BEGIN_DELIMITER)
self._metadata_width = 0
self._metadata = {
'files': {},
'data_length': 0,
}
self._write_metadata()
else:
self._load_metadata()
def add_files(self, fns):
"""
<Purpose>
Adds the specified files into the bundle.
<Arguments>
fns:
The list of filenames of the files that should be added to the bundle.
<Side Effects>
The specified files will be locked for the duration of the write.
<Exceptions>
Throws a ValueError if fns is not a list.
Throws an InvalidOperationError if the file already exists in the bundle.
<Return>
A dictionary containing the files that failed to write, mapped to the
exceptions that they raised.
"""
if not isinstance(fns, list):
raise ValueError("fns must be a list")
for fn in fns:
if fn in self._metadata['files']:
raise bundle_InvalidOperationError("File already exists")
# Total amount of data added in this method
total_encoded_length = 0
# Amount of data to read at a time
chunksize = 4096
# The last file ends where the metadata begins
self._goto_bundle_position(0, -1)
for fn in fns:
# Start writing the file contents over
srcfile = open(fn, 'rb')
data = srcfile.read(chunksize)
file_encoded_length = 0
while data:
encoded_data = _bundle_embed_encode(data)
self._fobj.write(encoded_data)
file_encoded_length += len(encoded_data)
data = srcfile.read(chunksize)
# Update the metadata
self._metadata['files'][fn] = {
'location': self._metadata['data_length'] + total_encoded_length,
'length': file_encoded_length
}
total_encoded_length += file_encoded_length
self._metadata['data_length'] += total_encoded_length
self._write_metadata()
def add_string(self, fn, data):
"""
<Purpose>
Enters the data into the bundle.
<Arguments>
fn: The filename to add under
data: The string data to write.
<Side Effects>
The bundle will have a new entry with the specified data.
<Exceptions>
None
<Returns>
None
"""
data = _bundle_embed_encode(data)
# The last file ends where the metadata begins
self._goto_bundle_position(0, -1)
self._fobj.write(data)
# Update the metadata
self._metadata['files'][fn] = {
'location': self._metadata['data_length'],
'length': len(data)
}
self._metadata['data_length'] += len(data)
self._write_metadata()
def add(self, fn):
"""
<Purpose>
Wrapper for add_files() that operates on 1 file.
For more information, see add_files().
<Arguments>
fn: The file to add.
"""
self.add_files([fn])
def remove(self, fn):
"""
<Purpose>
Wrapper for remove_files() that operates on 1 file.
For more information, see remove_files().
<Arguments>
fn: The file to remove.
"""
self.remove_files([fn])
def remove_files(self, fns_to_remove):
"""
<Purpose>
Removes the specified files from the bundle.
<Arguments>
fns_to_remove:
The list of filenames of the files that should be removed from the
bundle.
<Side Effects>
Creates a temporary file to hold the bundle contents.
<Exceptions>
Throws a ValueError if fns_to_remove is not a list.
Throws an InvalidOperationError if the file does not exist in the bundle.
<Return>
A dictionary containing the files that failed to be removed, mapped to
the exceptions that they raised.
"""
if not isinstance(fns_to_remove, list):
raise ValueError("fns_to_remove must be a list")
# Do all the files exist?
for fn in fns_to_remove:
if not fn in self._metadata['files']:
raise bundle_InvalidOperationError("File not Found: " + fn)
tempfn = "thetempfile"
tempfile = open(tempfn, 'wb+')
# Copy over everything right before the bundle begin
self._fobj.seek(0, 0)
bundle_data_position = _bundle_find_next_string_occurrence_in_file(self._fobj, _BUNDLE_DATA_BEGIN_DELIMITER)
self._fobj.seek(0, 0)
_bundle_copy_file_contents(self._fobj, tempfile, bundle_data_position)
# Mark the beginning of the bundled data
tempfile.write(_BUNDLE_DATA_BEGIN_DELIMITER)
# Remove the metadata entries that we don't need