update #1

Merged: 170 commits, Aug 6, 2014
Commits
2b8d89e
[SPARK-2523] [SQL] Hadoop table scan bug fixing
chenghao-intel Jul 28, 2014
255b56f
[SPARK-2479][MLlib] Comparing floating-point numbers using relative e…
Jul 28, 2014
a7a9d14
[SPARK-2410][SQL] Merging Hive Thrift/JDBC server (with Maven profile…
liancheng Jul 28, 2014
39ab87b
Use commons-lang3 in SignalLogger rather than commons-lang
aarondav Jul 28, 2014
16ef4d1
Excess judgment
watermen Jul 29, 2014
ccd5ab5
[SPARK-2580] [PySpark] keep silent in worker if JVM close the socket
davies Jul 29, 2014
92ef026
[SPARK-791] [PySpark] fix pickle itemgetter with cloudpickle
davies Jul 29, 2014
96ba04b
[SPARK-2726] and [SPARK-2727] Remove SortOrder and do in-place sort.
rxin Jul 29, 2014
20424da
[SPARK-2174][MLLIB] treeReduce and treeAggregate
mengxr Jul 29, 2014
fc4d057
Minor indentation and comment typo fixes.
staple Jul 29, 2014
800ecff
[STREAMING] SPARK-1729. Make Flume pull data from source, rather than…
harishreedharan Jul 29, 2014
0c5c6a6
[SQL]change some test lists
adrian-wang Jul 29, 2014
e364348
[SPARK-2730][SQL] When retrieving a value from a Map, GetItem evaluat…
yhuai Jul 29, 2014
f0d880e
[SPARK-2674] [SQL] [PySpark] support datetime type for SchemaRDD
davies Jul 29, 2014
dc96536
[SPARK-2082] stratified sampling in PairRDDFunctions that guarantees …
dorx Jul 29, 2014
c7db274
[SPARK-2393][SQL] Cost estimation optimization framework for Catalyst…
concretevitamin Jul 29, 2014
2c35666
MAINTENANCE: Automated closing of pull requests.
pwendell Jul 30, 2014
39b8193
[SPARK-2716][SQL] Don't check resolved for having filters.
marmbrus Jul 30, 2014
86534d0
[SPARK-2631][SQL] Use SQLConf to configure in-memory columnar caching
marmbrus Jul 30, 2014
22649b6
[SPARK-2305] [PySpark] Update Py4J to version 0.8.2.1
JoshRosen Jul 30, 2014
8446746
[SPARK-2054][SQL] Code Generation for Expression Evaluation
marmbrus Jul 30, 2014
2e6efca
[SPARK-2568] RangePartitioner should run only one job if data is bala…
mengxr Jul 30, 2014
077f633
[SQL] Handle null values in debug()
marmbrus Jul 30, 2014
4ce92cc
[SPARK-2260] Fix standalone-cluster mode, which was broken
andrewor14 Jul 30, 2014
7003c16
[SPARK-2179][SQL] Public API for DataTypes and Schema
yhuai Jul 30, 2014
7c5fc28
SPARK-2543: Allow user to set maximum Kryo buffer size
koertkuipers Jul 30, 2014
ee07541
SPARK-2748 [MLLIB] [GRAPHX] Loss of precision for small arguments to …
srowen Jul 30, 2014
774142f
[SPARK-2521] Broadcast RDD object (instead of sending it along with e…
rxin Jul 30, 2014
3bc3f18
[SPARK-2747] git diff --dirstat can miss sql changes and not run Hive…
rxin Jul 30, 2014
e3d85b7
Avoid numerical instability
naftaliharris Jul 30, 2014
fc47bb6
[SPARK-2544][MLLIB] Improve ALS algorithm resource usage
witgo Jul 30, 2014
ff511ba
[SPARK-2746] Set SBT_MAVEN_PROFILES only when it is not set explicitl…
rxin Jul 30, 2014
f2eb84f
Wrap FWDIR in quotes.
rxin Jul 30, 2014
95cf203
Wrap FWDIR in quotes in dev/check-license.
rxin Jul 30, 2014
0feb349
More wrapping FWDIR in quotes.
rxin Jul 30, 2014
2248891
[SQL] Fix compiling of catalyst docs.
marmbrus Jul 30, 2014
437dc8c
dev/check-license wrap folders in quotes.
rxin Jul 30, 2014
94d1f46
[SPARK-2024] Add saveAsSequenceFile to PySpark
kanzhang Jul 30, 2014
7c7ce54
Wrap JAR_DL in dev/check-license.
rxin Jul 30, 2014
1097327
Set AMPLAB_JENKINS_BUILD_PROFILE.
rxin Jul 30, 2014
2f4b170
Properly pass SBT_MAVEN_PROFILES into sbt.
rxin Jul 30, 2014
6ab96a6
SPARK-2749 [BUILD]. Spark SQL Java tests aren't compiling in Jenkins'…
srowen Jul 30, 2014
2ac37db
SPARK-2741 - Publish version of spark assembly which does not contain…
Jul 31, 2014
88a519d
[SPARK-2734][SQL] Remove tables from cache when DROP TABLE is run.
marmbrus Jul 31, 2014
e9b275b
SPARK-2341 [MLLIB] loadLibSVMFile doesn't handle regression datasets
srowen Jul 31, 2014
da50176
Update DecisionTreeRunner.scala
strat0sphere Jul 31, 2014
e966284
SPARK-2045 Sort-based shuffle
mateiz Jul 31, 2014
894d48f
[SPARK-2758] UnionRDD's UnionPartition should not reference parent RDDs
rxin Jul 31, 2014
118c1c4
Required AM memory is "amMem", not "args.amMemory"
maji2014 Jul 31, 2014
a7c305b
[SPARK-2340] Resolve event logging and History Server paths properly
andrewor14 Jul 31, 2014
4fb2593
[SPARK-2737] Add retag() method for changing RDDs' ClassTags.
JoshRosen Jul 31, 2014
5a110da
[SPARK-2497] Included checks for module symbols too.
ScrapCodes Jul 31, 2014
669e3f0
automatically set master according to `spark.master` in `spark-defaul…
CrazyJvm Jul 31, 2014
92ca910
[SPARK-2762] SparkILoop leaks memory in multi-repl configurations
thunterdb Jul 31, 2014
3072b96
[SPARK-2743][SQL] Resolve original attributes in ParquetTableScan
marmbrus Jul 31, 2014
72cfb13
[SPARK-2397][SQL] Deprecate LocalHiveContext
marmbrus Jul 31, 2014
f193312
SPARK-2028: Expose mapPartitionsWithInputSplit in HadoopRDD
aarondav Jul 31, 2014
f68105d
SPARK-2664. Deal with `--conf` options in spark-submit that relate to…
sryza Jul 31, 2014
4dbabb3
SPARK-2749 [BUILD] Part 2. Fix a follow-on scalastyle error
srowen Jul 31, 2014
e5749a1
SPARK-2646. log4j initialization not quite compatible with log4j 2.x
srowen Jul 31, 2014
dc0865b
[SPARK-2511][MLLIB] add HashingTF and IDF
mengxr Jul 31, 2014
49b3612
[SPARK-2523] [SQL] Hadoop table scan bug fixing (fix failing Jenkins …
yhuai Jul 31, 2014
e021362
Improvements to merge_spark_pr.py
JoshRosen Jul 31, 2014
cc82050
Docs: monitoring, streaming programming guide
kennyballou Jul 31, 2014
492a195
SPARK-2740: allow user to specify ascending and numPartitions for sor…
Jul 31, 2014
ef4ff00
SPARK-2282: Reuse Socket for sending accumulator updates to Pyspark
aarondav Jul 31, 2014
8f51491
[SPARK-2531 & SPARK-2436] [SQL] Optimize the BuildSide when planning …
concretevitamin Aug 1, 2014
d843014
[SPARK-2724] Python version of RandomRDDGenerators
dorx Aug 1, 2014
b124de5
[SPARK-2756] [mllib] Decision tree bug fixes
jkbradley Aug 1, 2014
9632719
[SPARK-2779] [SQL] asInstanceOf[Map[...]] should use scala.collection…
yhuai Aug 1, 2014
9998efa
SPARK-2766: ScalaReflectionSuite throws an IllegalArgumentException i…
witgo Aug 1, 2014
b190083
[SPARK-2777][MLLIB] change ALS factors storage level to MEMORY_AND_DISK
mengxr Aug 1, 2014
c475540
[SPARK-2782][mllib] Bug fix for getRanks in SpearmanCorrelation
dorx Aug 1, 2014
2cdc3e5
[SPARK-2702][Core] Upgrade Tachyon dependency to 0.5.0
haoyuan Aug 1, 2014
1499101
SPARK-2632, SPARK-2576. Fixed by only importing what is necessary dur…
ScrapCodes Aug 1, 2014
cb9e7d5
SPARK-2738. Remove redundant imports in BlockManagerSuite
sryza Aug 1, 2014
8ff4417
[SPARK-2670] FetchFailedException should be thrown when local fetch h…
sarutak Aug 1, 2014
72e3369
SPARK-983. Support external sorting in sortByKey()
mateiz Aug 1, 2014
f1957e1
SPARK-2134: Report metrics before application finishes
Aug 1, 2014
284771e
[Spark 2557] fix LOCAL_N_REGEX in createTaskScheduler and make local-…
advancedxy Aug 1, 2014
a32f0fb
[SPARK-2103][Streaming] Change to ClassTag for KafkaInputDStream and …
jerryshao Aug 1, 2014
82d209d
SPARK-2768 [MLLIB] Add product, user recommend method to MatrixFactor…
srowen Aug 1, 2014
0dacb1a
[SPARK-1997] update breeze to version 0.8.1
witgo Aug 1, 2014
5328c0a
[HOTFIX] downgrade breeze version to 0.7
mengxr Aug 1, 2014
8d338f6
SPARK-2099. Report progress while task is running.
sryza Aug 1, 2014
c41fdf0
[SPARK-2179][SQL] A minor refactoring Java data type APIs (2179 follo…
yhuai Aug 1, 2014
4415722
[SQL][SPARK-2212]Hash Outer Join
chenghao-intel Aug 1, 2014
580c701
[SPARK-2729] [SQL] Forgot to match Timestamp type in ColumnBuilder
chutium Aug 1, 2014
c0b47ba
[SPARK-2767] [SQL] SparkSQL CLI doesn't output error message if query…
chenghao-intel Aug 1, 2014
c82fe47
[SQL] Documentation: Explain cacheTable command
CrazyJvm Aug 1, 2014
eb5bdca
[SPARK-695] In DAGScheduler's getPreferredLocs, track set of visited …
staple Aug 1, 2014
baf9ce1
[SPARK-2490] Change recursive visiting on RDD dependencies to iterati…
viirya Aug 1, 2014
f5d9bea
SPARK-1612: Fix potential resource leaks
zsxwing Aug 1, 2014
b270309
[SPARK-2379] Fix the bug that streaming's receiver may fall into a de…
joyyoj Aug 1, 2014
78f2af5
SPARK-2791: Fix committing, reverting and state tracking in shuffle f…
aarondav Aug 1, 2014
d88e695
[SPARK-2786][mllib] Python correlations
dorx Aug 1, 2014
7058a53
[SPARK-2796] [mllib] DecisionTree bug fix: ordered categorical features
jkbradley Aug 1, 2014
880eabe
[SPARK-2010] [PySpark] [SQL] support nested structure in SchemaRDD
davies Aug 2, 2014
3822f33
[SPARK-2212][SQL] Hash Outer Join (follow-up bug fix).
yhuai Aug 2, 2014
0da07da
[SPARK-2116] Load spark-defaults.conf from SPARK_CONF_DIR if set
chu11 Aug 2, 2014
a38d3c9
[SPARK-2800]: Exclude scalastyle-output.xml Apache RAT checks
witgo Aug 2, 2014
e8e0fd6
[SPARK-2764] Simplify daemon.py process structure
JoshRosen Aug 2, 2014
f6a1899
Streaming mllib [SPARK-2438][MLLIB]
freeman-lab Aug 2, 2014
c281189
[SPARK-2550][MLLIB][APACHE SPARK] Support regularization and intercep…
miccagiann Aug 2, 2014
e25ec06
[SPARK-1580][MLLIB] Estimate ALS communication and computation costs.
tmyklebu Aug 2, 2014
fda4759
[SPARK-2801][MLlib]: DistributionGenerator renamed to RandomDataGener…
brkyvz Aug 2, 2014
4bc3bb2
StatCounter on NumPy arrays [PYSPARK][SPARK-2012]
freeman-lab Aug 2, 2014
adc8303
[SPARK-1470][SPARK-1842] Use the scala-logging wrapper instead of the…
witgo Aug 2, 2014
dab3796
Revert "[SPARK-1470][SPARK-1842] Use the scala-logging wrapper instea…
pwendell Aug 2, 2014
d934801
[SPARK-2316] Avoid O(blocks) operations in listeners
andrewor14 Aug 2, 2014
148af60
[SPARK-2454] Do not ship spark home to Workers
andrewor14 Aug 2, 2014
08c095b
[SPARK-1812] sql/catalyst - Provide explicit type information
avati Aug 2, 2014
25cad6a
HOTFIX: Fixing test error in maven for flume-sink.
pwendell Aug 2, 2014
44460ba
HOTFIX: Fix concurrency issue in FlumePollingStreamSuite.
pwendell Aug 2, 2014
87738bf
MAINTENANCE: Automated closing of pull requests.
pwendell Aug 2, 2014
e09e18b
[HOTFIX] Do not throw NPE if spark.test.home is not set
andrewor14 Aug 2, 2014
3f67382
[SPARK-2478] [mllib] DecisionTree Python API
jkbradley Aug 2, 2014
67bd8e3
[SQL] Set outputPartitioning of BroadcastHashJoin correctly.
yhuai Aug 2, 2014
91f9504
[SPARK-1981] Add AWS Kinesis streaming support
cfregly Aug 2, 2014
4c47711
SPARK-2804: Remove scalalogging-slf4j dependency
witgo Aug 2, 2014
158ad0b
[SPARK-2097][SQL] UDF Support
marmbrus Aug 2, 2014
198df11
[SPARK-2785][SQL] Remove assertions that throw when users try unsuppo…
marmbrus Aug 2, 2014
866cf1f
[SPARK-2729][SQL] Added test case for SPARK-2729
liancheng Aug 3, 2014
d210022
[SPARK-2797] [SQL] SchemaRDDs don't support unpersist()
yhuai Aug 3, 2014
1a80437
[SPARK-2739][SQL] Rename registerAsTable to registerTempTable
marmbrus Aug 3, 2014
33f167d
SPARK-2602 [BUILD] Tests steal focus under Java 6
srowen Aug 3, 2014
9cf429a
SPARK-2414 [BUILD] Add LICENSE entry for jquery
srowen Aug 3, 2014
3dc55fd
[Minor] Fixes on top of #1679
andrewor14 Aug 3, 2014
f8cd143
SPARK-2712 - Add a small note to maven doc that mvn package must happ…
javadba Aug 3, 2014
a0bcbc1
SPARK-2246: Add user-data option to EC2 scripts
Aug 3, 2014
2998e38
[SPARK-2197] [mllib] Java DecisionTree bug fix and ease-of-use
jkbradley Aug 3, 2014
236dfac
[SPARK-2784][SQL] Deprecate hql() method in favor of a config option,…
marmbrus Aug 3, 2014
ac33cbb
[SPARK-2814][SQL] HiveThriftServer2 throws NPE when executing native …
liancheng Aug 3, 2014
e139e2b
[SPARK-2783][SQL] Basic support for analyze in HiveContext
yhuai Aug 3, 2014
55349f9
[SPARK-1740] [PySpark] kill the python worker
davies Aug 3, 2014
6ba6c3e
[SPARK-2810] upgrade to scala-maven-plugin 3.2.0
avati Aug 4, 2014
5507dd8
Fix some bugs with spaces in directory name.
sarahgerweck Aug 4, 2014
ae58aea
SPARK-2272 [MLlib] Feature scaling which standardizes the range of in…
Aug 4, 2014
e053c55
[MLlib] [SPARK-2510]Word2Vec: Distributed Representation of Words
Aug 4, 2014
59f84a9
[SPARK-1687] [PySpark] pickable namedtuple
davies Aug 4, 2014
8e7d5ba
SPARK-2792. Fix reading too much or too little data from each stream …
mateiz Aug 4, 2014
9fd82db
[SPARK-1687] [PySpark] fix unit tests related to pickable namedtuple
davies Aug 4, 2014
05bf4e4
[SPARK-2323] Exception in accumulator update should not crash DAGSche…
rxin Aug 5, 2014
066765d
SPARK-2685. Update ExternalAppendOnlyMap to avoid buffer.remove()
mateiz Aug 5, 2014
4fde28c
SPARK-2711. Create a ShuffleMemoryManager to track memory for all spi…
mateiz Aug 5, 2014
a646a36
[SPARK-2857] Correct properties to set Master / Worker ports
andrewor14 Aug 5, 2014
9862c61
[SPARK-1779] Throw an exception if memory fractions are not between 0…
Aug 5, 2014
184048f
[SPARK-2856] Decrease initial buffer size for Kryo to 64KB.
rxin Aug 5, 2014
e87075d
[SPARK-1022][Streaming] Add Kafka real unit test
jerryshao Aug 5, 2014
2c0f705
SPARK-1528 - spark on yarn, add support for accessing remote HDFS
tgravescs Aug 5, 2014
1c5555a
SPARK-1890 and SPARK-1891- add admin and modify acls
tgravescs Aug 5, 2014
6e821e3
[SPARK-2860][SQL] Fix coercion of CASE WHEN.
marmbrus Aug 5, 2014
ac3440f
[SPARK-2859] Update url of Kryo project in related docs
gchen Aug 5, 2014
74f82c7
SPARK-2380: Support displaying accumulator values in the web UI
pwendell Aug 5, 2014
41e0a21
SPARK-1680: use configs for specifying environment variables on YARN
tgravescs Aug 5, 2014
cc491f6
[SPARK-2864][MLLIB] fix random seed in word2vec; move model to local
mengxr Aug 5, 2014
acff9a7
[SPARK-2503] Lower shuffle output buffer (spark.shuffle.file.buffer.k…
rxin Aug 5, 2014
1aad911
[SPARK-2550][MLLIB][APACHE SPARK] Support regularization and intercep…
miccagiann Aug 5, 2014
2643e66
SPARK-2869 - Fix tiny bug in JdbcRdd for closing jdbc connection
Aug 6, 2014
d94f599
[sql] rename project name in pom.xml of hive-thriftserver module
scwf Aug 6, 2014
d0ae3f3
[SPARK-2650][SQL] Try to partially fix SPARK-2650 by adjusting initia…
liancheng Aug 6, 2014
69ec678
[SPARK-2854][SQL] Finalize _acceptable_types in pyspark.sql
yhuai Aug 6, 2014
1d70c4f
[SPARK-2866][SQL] Support attributes in ORDER BY that aren't in SELECT
marmbrus Aug 6, 2014
82624e2
[SPARK-2806] core - upgrade to json4s-jackson 3.2.10
avati Aug 6, 2014
b70bae4
[SQL] Tighten the visibility of various SQLConf methods and renamed s…
rxin Aug 6, 2014
5a826c0
[SQL] Fix logging warn -> debug
marmbrus Aug 6, 2014
63bdb1f
SPARK-2294: fix locality inversion bug in TaskManager
CodingCat Aug 6, 2014
c7b5201
[MLlib] Use this.type as return type in k-means' builder pattern
Aug 6, 2014
ee7f308
[SPARK-1022][Streaming][HOTFIX] Fixed zookeeper dependency of Kafka
tdas Aug 6, 2014
09f7e45
[SPARK-2157] Enable tight firewall rules for Spark
andrewor14 Aug 6, 2014
Files changed
1 change: 1 addition & 0 deletions .gitignore
@@ -58,3 +58,4 @@ metastore_db/
metastore/
warehouse/
TempStatsStore/
+sql/hive-thriftserver/test_warehouses
1 change: 1 addition & 0 deletions .rat-excludes
@@ -55,3 +55,4 @@ dist/*
.*ipr
.*iws
logs
+.*scalastyle-output.xml
5 changes: 3 additions & 2 deletions LICENSE
@@ -272,7 +272,7 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


========================================================================
-For Py4J (python/lib/py4j0.7.egg and files in assembly/lib/net/sf/py4j):
+For Py4J (python/lib/py4j-0.8.2.1-src.zip)
========================================================================

Copyright (c) 2009-2011, Barthelemy Dagenais All rights reserved.
@@ -532,7 +532,7 @@ The following components are provided under a BSD-style license. See project lin
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
-(The New BSD License) Py4J (net.sf.py4j:py4j:0.8.1 - http://py4j.sourceforge.net/)
+(The New BSD License) Py4J (net.sf.py4j:py4j:0.8.2.1 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(ISC/BSD License) jbcrypt (org.mindrot:jbcrypt:0.3m - http://www.mindrot.org/)

@@ -549,3 +549,4 @@ The following components are provided under the MIT License. See project link fo
(MIT License) pyrolite (org.spark-project:pyrolite:2.0.1 - http://pythonhosted.org/Pyro4/)
(MIT License) scopt (com.github.scopt:scopt_2.10:3.2.0 - https://github.com/scopt/scopt)
(The MIT License) Mockito (org.mockito:mockito-all:1.8.5 - http://www.mockito.org)
+(MIT License) jquery (https://jquery.org/license/)
10 changes: 10 additions & 0 deletions assembly/pom.xml
@@ -165,6 +165,16 @@
</dependency>
</dependencies>
</profile>
+    <profile>
+      <id>hive-thriftserver</id>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.spark</groupId>
+          <artifactId>spark-hive-thriftserver_${scala.binary.version}</artifactId>
+          <version>${project.version}</version>
+        </dependency>
+      </dependencies>
+    </profile>
<profile>
<id>spark-ganglia-lgpl</id>
<dependencies>
2 changes: 1 addition & 1 deletion bagel/pom.xml
@@ -28,7 +28,7 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-bagel_2.10</artifactId>
<properties>
-  <sbt.project.name>bagel</sbt.project.name>
+    <sbt.project.name>bagel</sbt.project.name>
</properties>
<packaging>jar</packaging>
<name>Spark Project Bagel</name>
45 changes: 45 additions & 0 deletions bin/beeline
@@ -0,0 +1,45 @@
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Figure out where Spark is installed
FWDIR="$(cd `dirname $0`/..; pwd)"

# Find the java binary
if [ -n "${JAVA_HOME}" ]; then
  RUNNER="${JAVA_HOME}/bin/java"
else
  if [ `command -v java` ]; then
    RUNNER="java"
  else
    echo "JAVA_HOME is not set" >&2
    exit 1
  fi
fi

# Compute classpath using external script
classpath_output=$($FWDIR/bin/compute-classpath.sh)
if [[ "$?" != "0" ]]; then
  echo "$classpath_output"
  exit 1
else
  CLASSPATH=$classpath_output
fi

CLASS="org.apache.hive.beeline.BeeLine"
exec "$RUNNER" -cp "$CLASSPATH" $CLASS "$@"
1 change: 1 addition & 0 deletions bin/compute-classpath.sh
@@ -52,6 +52,7 @@ if [ -n "$SPARK_PREPEND_CLASSES" ]; then
CLASSPATH="$CLASSPATH:$FWDIR/sql/catalyst/target/scala-$SCALA_VERSION/classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/core/target/scala-$SCALA_VERSION/classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/hive/target/scala-$SCALA_VERSION/classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/hive-thriftserver/target/scala-$SCALA_VERSION/classes"
CLASSPATH="$CLASSPATH:$FWDIR/yarn/stable/target/scala-$SCALA_VERSION/classes"
fi

2 changes: 1 addition & 1 deletion bin/pyspark
@@ -52,7 +52,7 @@ export PYSPARK_PYTHON

# Add the PySpark classes to the Python path:
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
-export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.8.1-src.zip:$PYTHONPATH
+export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH

# Load the PySpark shell.py script when ./pyspark is used interactively:
export OLD_PYTHONSTARTUP=$PYTHONSTARTUP
2 changes: 1 addition & 1 deletion bin/pyspark2.cmd
@@ -45,7 +45,7 @@ rem Figure out which Python to use.
if [%PYSPARK_PYTHON%] == [] set PYSPARK_PYTHON=python

set PYTHONPATH=%FWDIR%python;%PYTHONPATH%
-set PYTHONPATH=%FWDIR%python\lib\py4j-0.8.1-src.zip;%PYTHONPATH%
+set PYTHONPATH=%FWDIR%python\lib\py4j-0.8.2.1-src.zip;%PYTHONPATH%

set OLD_PYTHONSTARTUP=%PYTHONSTARTUP%
set PYTHONSTARTUP=%FWDIR%python\pyspark\shell.py
3 changes: 2 additions & 1 deletion bin/run-example
@@ -29,7 +29,8 @@ if [ -n "$1" ]; then
else
echo "Usage: ./bin/run-example <example-class> [example-args]" 1>&2
echo " - set MASTER=XX to use a specific master" 1>&2
echo " - can use abbreviated example class name (e.g. SparkPi, mllib.LinearRegression)" 1>&2
echo " - can use abbreviated example class name relative to com.apache.spark.examples" 1>&2
echo " (e.g. SparkPi, mllib.LinearRegression, streaming.KinesisWordCountASL)" 1>&2
exit 1
fi

3 changes: 2 additions & 1 deletion bin/run-example2.cmd
@@ -32,7 +32,8 @@ rem Test that an argument was given
if not "x%1"=="x" goto arg_given
echo Usage: run-example ^<example-class^> [example-args]
echo - set MASTER=XX to use a specific master
-echo  - can use abbreviated example class name (e.g. SparkPi, mllib.LinearRegression)
+echo  - can use abbreviated example class name relative to com.apache.spark.examples
+echo     (e.g. SparkPi, mllib.LinearRegression, streaming.KinesisWordCountASL)
goto exit
:arg_given

4 changes: 2 additions & 2 deletions bin/spark-shell
@@ -46,11 +46,11 @@ function main(){
# (see https://github.com/sbt/sbt/issues/562).
stty -icanon min 1 -echo > /dev/null 2>&1
export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"
-$FWDIR/bin/spark-submit spark-shell "$@" --class org.apache.spark.repl.Main
+$FWDIR/bin/spark-submit --class org.apache.spark.repl.Main spark-shell "$@"
stty icanon echo > /dev/null 2>&1
else
export SPARK_SUBMIT_OPTS
-$FWDIR/bin/spark-submit spark-shell "$@" --class org.apache.spark.repl.Main
+$FWDIR/bin/spark-submit --class org.apache.spark.repl.Main spark-shell "$@"
fi
}

2 changes: 1 addition & 1 deletion bin/spark-shell.cmd
@@ -19,4 +19,4 @@ rem

set SPARK_HOME=%~dp0..

-cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd spark-shell %* --class org.apache.spark.repl.Main
+cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd spark-shell --class org.apache.spark.repl.Main %*
36 changes: 36 additions & 0 deletions bin/spark-sql
@@ -0,0 +1,36 @@
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#
# Shell script for starting the Spark SQL CLI

# Enter posix mode for bash
set -o posix

# Figure out where Spark is installed
FWDIR="$(cd `dirname $0`/..; pwd)"

if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
echo "Usage: ./sbin/spark-sql [options]"
$FWDIR/bin/spark-submit --help 2>&1 | grep -v Usage 1>&2
exit 0
fi

CLASS="org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver"
exec "$FWDIR"/bin/spark-submit --class $CLASS spark-internal $@
15 changes: 10 additions & 5 deletions core/pom.xml
@@ -28,7 +28,7 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<properties>
-  <sbt.project.name>core</sbt.project.name>
+    <sbt.project.name>core</sbt.project.name>
</properties>
<packaging>jar</packaging>
<name>Spark Project Core</name>
@@ -150,7 +150,7 @@
<dependency>
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_${scala.binary.version}</artifactId>
-      <version>3.2.6</version>
+      <version>3.2.10</version>
</dependency>
<dependency>
<groupId>colt</groupId>
@@ -192,8 +192,8 @@
</dependency>
<dependency>
<groupId>org.tachyonproject</groupId>
-      <artifactId>tachyon</artifactId>
-      <version>0.4.1-thrift</version>
+      <artifactId>tachyon-client</artifactId>
+      <version>0.5.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
@@ -262,6 +262,11 @@
<artifactId>asm</artifactId>
<scope>test</scope>
</dependency>
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <scope>test</scope>
+    </dependency>
<dependency>
<groupId>com.novocode</groupId>
<artifactId>junit-interface</artifactId>
@@ -275,7 +280,7 @@
<dependency>
<groupId>net.sf.py4j</groupId>
<artifactId>py4j</artifactId>
-      <version>0.8.1</version>
+      <version>0.8.2.1</version>
</dependency>
</dependencies>
<build>
19 changes: 15 additions & 4 deletions core/src/main/scala/org/apache/spark/Accumulators.scala
@@ -36,15 +36,21 @@ import org.apache.spark.serializer.JavaSerializer
*
* @param initialValue initial value of accumulator
* @param param helper object defining how to add elements of type `R` and `T`
+ * @param name human-readable name for use in Spark's web UI
* @tparam R the full accumulated data (result type)
* @tparam T partial data that can be added in
*/
class Accumulable[R, T] (
@transient initialValue: R,
-    param: AccumulableParam[R, T])
+    param: AccumulableParam[R, T],
+    val name: Option[String])
extends Serializable {

-  val id = Accumulators.newId
+  def this(@transient initialValue: R, param: AccumulableParam[R, T]) =
+    this(initialValue, param, None)
+
+  val id: Long = Accumulators.newId

@transient private var value_ = initialValue // Current value on master
val zero = param.zero(initialValue) // Zero value to be passed to workers
private var deserialized = false
@@ -219,8 +225,10 @@ GrowableAccumulableParam[R <% Growable[T] with TraversableOnce[T] with Serializa
* @param param helper object defining how to add elements of type `T`
* @tparam T result type
*/
-class Accumulator[T](@transient initialValue: T, param: AccumulatorParam[T])
-  extends Accumulable[T,T](initialValue, param)
+class Accumulator[T](@transient initialValue: T, param: AccumulatorParam[T], name: Option[String])
+  extends Accumulable[T,T](initialValue, param, name) {
+  def this(initialValue: T, param: AccumulatorParam[T]) = this(initialValue, param, None)
+}

/**
* A simpler version of [[org.apache.spark.AccumulableParam]] where the only data type you can add
@@ -281,4 +289,7 @@ private object Accumulators {
}
}
}

+  def stringifyPartialValue(partialValue: Any) = "%s".format(partialValue)
+  def stringifyValue(value: Any) = "%s".format(value)
}
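Note: the constructors above add an optional name that the web UI can display (SPARK-2380). Below is a minimal sketch, not part of this diff, of how the new three-argument constructor could be exercised; the object name, app name, master URL, and accumulator name are illustrative assumptions, while Accumulator and SparkContext.IntAccumulatorParam are existing Spark APIs.

import org.apache.spark.{Accumulator, SparkConf, SparkContext}
import org.apache.spark.SparkContext.IntAccumulatorParam

// Sketch only: exercises the new `name: Option[String]` parameter added above.
object NamedAccumulatorSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("named-acc").setMaster("local[2]"))
    // Some("badRecords") flows into the new constructor parameter, so the
    // accumulator can be shown by name in the web UI.
    val badRecords = new Accumulator(0, IntAccumulatorParam, Some("badRecords"))
    sc.parallelize(1 to 1000).foreach { i =>
      if (i % 100 == 0) badRecords += 1
    }
    println(s"${badRecords.name.getOrElse("unnamed")} = ${badRecords.value}") // badRecords = 10
    sc.stop()
  }
}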
24 changes: 16 additions & 8 deletions core/src/main/scala/org/apache/spark/Aggregator.scala
@@ -56,18 +56,23 @@ case class Aggregator[K, V, C] (
} else {
val combiners = new ExternalAppendOnlyMap[K, V, C](createCombiner, mergeValue, mergeCombiners)
combiners.insertAll(iter)
-      // TODO: Make this non optional in a future release
-      Option(context).foreach(c => c.taskMetrics.memoryBytesSpilled = combiners.memoryBytesSpilled)
-      Option(context).foreach(c => c.taskMetrics.diskBytesSpilled = combiners.diskBytesSpilled)
+      // Update task metrics if context is not null
+      // TODO: Make context non optional in a future release
+      Option(context).foreach { c =>
+        c.taskMetrics.memoryBytesSpilled += combiners.memoryBytesSpilled
+        c.taskMetrics.diskBytesSpilled += combiners.diskBytesSpilled
+      }
combiners.iterator
}
}

@deprecated("use combineCombinersByKey with TaskContext argument", "0.9.0")
-  def combineCombinersByKey(iter: Iterator[(K, C)]) : Iterator[(K, C)] =
+  def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]]) : Iterator[(K, C)] =
combineCombinersByKey(iter, null)

-  def combineCombinersByKey(iter: Iterator[(K, C)], context: TaskContext) : Iterator[(K, C)] = {
+  def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: TaskContext)
+      : Iterator[(K, C)] =
+  {
if (!externalSorting) {
val combiners = new AppendOnlyMap[K,C]
var kc: Product2[K, C] = null
@@ -85,9 +90,12 @@ case class Aggregator[K, V, C] (
val pair = iter.next()
combiners.insert(pair._1, pair._2)
}
-      // TODO: Make this non optional in a future release
-      Option(context).foreach(c => c.taskMetrics.memoryBytesSpilled = combiners.memoryBytesSpilled)
-      Option(context).foreach(c => c.taskMetrics.diskBytesSpilled = combiners.diskBytesSpilled)
+      // Update task metrics if context is not null
+      // TODO: Make context non-optional in a future release
+      Option(context).foreach { c =>
+        c.taskMetrics.memoryBytesSpilled += combiners.memoryBytesSpilled
+        c.taskMetrics.diskBytesSpilled += combiners.diskBytesSpilled
+      }
combiners.iterator
}
}
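Note: the switch from `=` to `+=` in both hunks above matters because a single task can spill during more than one aggregation phase, and plain assignment would overwrite the earlier phase's counters. A toy sketch of the difference follows; the values are illustrative assumptions, not taken from this diff.

// Spill sizes recorded by two phases of the same task (illustrative numbers).
val spillsPerPhase = Seq(512L, 1024L)

// Old behavior: each phase assigned its own total, so only the last survived.
var overwritten = 0L
spillsPerPhase.foreach(bytes => overwritten = bytes)  // 1024

// New behavior: each phase adds to the running total.
var accumulated = 0L
spillsPerPhase.foreach(bytes => accumulated += bytes) // 1536

println(s"overwritten=$overwritten, accumulated=$accumulated")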