Repository: spark
Updated Branches:
  refs/heads/branch-1.0 b1a7e99fe -> 05d85c86e


[SPARK-2013] Documentation for saveAsPickleFile and pickleFile in Python

Author: Kan Zhang <[email protected]>

Closes #983 from kanzhang/SPARK-2013 and squashes the following commits:

0e128bb [Kan Zhang] [SPARK-2013] minor update
e728516 [Kan Zhang] [SPARK-2013] Documentation for saveAsPickleFile and 
pickleFile in Python

(cherry picked from commit b52603b039cdfa0f8e58ef3c6229d79e732ffc58)
Signed-off-by: Reynold Xin <[email protected]>

Conflicts:
        docs/programming-guide.md


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/05d85c86
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/05d85c86
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/05d85c86

Branch: refs/heads/branch-1.0
Commit: 05d85c86ecbf75f7bb13efcf24b3af4e9e3ef612
Parents: b1a7e99
Author: Kan Zhang <[email protected]>
Authored: Sat Jun 14 13:22:30 2014 -0700
Committer: Reynold Xin <[email protected]>
Committed: Sat Jun 14 13:36:21 2014 -0700

----------------------------------------------------------------------
 docs/programming-guide.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/05d85c86/docs/programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 7d77e64..b667aa0 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -379,10 +379,12 @@ Some notes on reading files with Spark:
 * The `textFile` method also takes an optional second argument for controlling the number of slices of the file. By default, Spark creates one slice for each block of the file (blocks being 64MB by default in HDFS), but you can also ask for a higher number of slices by passing a larger value. Note that you cannot have fewer slices than blocks.
 
 Apart reading files as a collection of lines,
-`SparkContext.wholeTextFiles` lets you read a directory containing multiple small text files, and returns each of them as (filename, content) pairs. This is in contrast with `textFile`, which would return one record per line in each file.
 
-</div>
+* `SparkContext.wholeTextFiles` lets you read a directory containing multiple small text files, and returns each of them as (filename, content) pairs. This is in contrast with `textFile`, which would return one record per line in each file.
 
+* `RDD.saveAsPickleFile` and `SparkContext.pickleFile` support saving an RDD in a simple format consisting of pickled Python objects. Batching is used on pickle serialization, with default batch size 10.
+
+</div>
 
 </div>
 
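The new bullet documents the PySpark calls (in PySpark these are invoked as `rdd.saveAsPickleFile(path)` and `sc.pickleFile(path)`). To illustrate what "batching is used on pickle serialization, with default batch size 10" means, here is a minimal plain-Python sketch of the batching idea only; it is not Spark's actual on-disk format (which stores the pickled batches inside a SequenceFile), and the helper names `pickle_in_batches` / `unpickle_batches` are hypothetical:

```python
import pickle

def pickle_in_batches(records, batch_size=10):
    # Hypothetical illustration: group records into batches of up to
    # `batch_size` and pickle each batch as one blob, mirroring the
    # batching described in the doc (default batch size 10).
    return [
        pickle.dumps(records[i:i + batch_size])
        for i in range(0, len(records), batch_size)
    ]

def unpickle_batches(blobs):
    # Reverse of the above: unpickle each blob and flatten the batches
    # back into a single list of records.
    records = []
    for blob in blobs:
        records.extend(pickle.loads(blob))
    return records

data = list(range(25))
blobs = pickle_in_batches(data)       # 25 records -> 3 pickled batches
assert len(blobs) == 3
assert unpickle_batches(blobs) == data
```

Batching amortizes per-object pickling overhead: each blob round-trips a whole batch of records rather than one record at a time.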
