apache-spark
Here are 900 public repositories matching this topic...
According to the generated build, the commands to launch are the following:
docker pull andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.1-hadoop-2.7.2-with-hive
docker run -p 9001:9001 andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.1-hadoop-2.7.2-with-hive
Using that image (and I think it i
1. add docs to describe the model, explain the arguments (as well as how to configure them in a recipe) and best practices. A good reference: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/coxph.html
2. use the same parameter name in MTNetForecaster and AutoTS recipes.
3. in MTNetGridRandomRecipe, past_seq_len actually is conditioned on time_step and long_num,
https://gi
We create multiple jars during our builds to accommodate multiple versions of Apache Spark. In the current approach, the implementation is copied from one version to another and then the necessary changes are made.
An ideal approach would be to create a common directory and extract shared classes from the duplicated code. Note that even if a class/code is exactly the same, you cannot pull it out to a common clas
Currently the documentation is in the form of a bunch of markdown files under the docs folder of the repo. It would be great to have a dedicated website for the project to host the documentation and announcements such as releases.
Since we already consider #140, I guess we should look at MLflow as well. Definitely not now, but maybe when/if it gets its first stable release.
CC @eliasah
The documentation file appears to have been generated with no space between the hashes and the header text. This causes the headers to render incorrectly and makes them difficult to read. See below for an example with and without the space:
## Mobius API Documentation
###Microsoft.Spark.CSharp.Core.Accumulator
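If regenerating the docs is not practical, a one-off post-processing pass could repair the output. This is a hypothetical fix, not part of the Mobius doc generator; it inserts the missing space while leaving already-correct headers alone:

```python
import re

def fix_markdown_headers(text: str) -> str:
    """Insert a missing space between leading '#' characters and header text.

    Hypothetical post-processing helper, not part of Mobius: it matches
    start-of-line hashes followed immediately by a non-space character and
    leaves correctly spaced headers untouched.
    """
    return re.sub(r'^(#{1,6})(?=[^#\s])', r'\1 ', text, flags=re.MULTILINE)

broken = "###Microsoft.Spark.CSharp.Core.Accumulator"
print(fix_markdown_headers(broken))  # -> "### Microsoft.Spark.CSharp.Core.Accumulator"
```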
Because some users have had problems configuring these services, it could be helpful to make some examples or videos about how to properly set up Optimus on these services.
The "components" returned from ml_pca() are NULL
# example
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)
pca <- iris_tbl %>%
  select(-Species) %>%
  ml_pca()
pca$components
#> NULL

R session information:
devtools::session_info()
Session info ------------------------------------------------------------
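For reference, the principal components that a PCA "components" slot should hold (rather than NULL) are the right singular vectors of the centered data matrix. A minimal numpy sketch, not sparklyr code, using a stand-in for the numeric iris columns:

```python
import numpy as np

# Small stand-in dataset: 4 numeric columns, like iris without Species.
X = np.array([[5.1, 3.5, 1.4, 0.2],
              [4.9, 3.0, 1.4, 0.2],
              [6.2, 3.4, 5.4, 2.3],
              [5.9, 3.0, 5.1, 1.8]])

# Center each column, then take the SVD; the rows of Vt are the
# principal axes (what a PCA "components" result should contain).
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt
print(components.shape)  # (4, 4): one component per input column
```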
Currently, geospark is based on JTS's STRtree. prestodb/presto#13079 could be a great enhancement regarding memory pressure, i.e. implementing it using a Hilbert-packed R-tree (Flatbush).
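For context, a Hilbert-packed R-tree bulk-loads by sorting entries along a Hilbert curve before grouping them into nodes, so spatially close entries land in the same node. A Python sketch of the classic xy2d Hilbert-index computation such a sort would use (not GeoSpark or Flatbush code; Flatbush does this on a fixed 16-bit grid):

```python
def hilbert_index(n, x, y):
    """Map grid cell (x, y) to its position along an order-n Hilbert curve.

    n is the grid side length and must be a power of two. Sketch of the
    classic xy2d algorithm: sorting rectangle centers by this index before
    packing gives a Hilbert-packed R-tree layout.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Sorting 2D points by their Hilbert index groups nearby points together:
points = [(3, 0), (0, 0), (1, 1), (0, 3)]
points.sort(key=lambda p: hilbert_index(4, p[0], p[1]))
```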
If a cluster launch is interrupted before AWS can even return a list of instances, we hit a part of the code where cluster_instances is not defined. We should protect against that.
Additionally, when Flintrock comes across a broken cluster (e.g. missing tags) left behind by an interrup
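One way to protect against the undefined-variable case is to bind the name before anything that can raise, so cleanup code can always reference it. A sketch under stated assumptions, not Flintrock's actual code: `request_instances` and `terminate` stand in for the real AWS calls.

```python
def launch_cluster(request_instances, terminate):
    """Sketch: guard cleanup when a launch is interrupted early.

    request_instances and terminate are hypothetical stand-ins for the
    real AWS calls. The key point: cluster_instances is bound before any
    call that can fail, so the except block can never hit a NameError.
    """
    cluster_instances = []  # defined up front, even if AWS never returns
    try:
        cluster_instances = request_instances()
        # ... tag instances, wait for them to come up, etc. ...
        return cluster_instances
    except (Exception, KeyboardInterrupt):
        # Safe: cluster_instances always exists here, possibly empty.
        for instance in cluster_instances:
            terminate(instance)
        raise
```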
While trying to write some tests of sparkle using Tasty, I found that it doesn't seem to work when bound threads other than the main one are used. The following program fails with:
$ stack --nix exec -- spark-submit --master 'local[1]' sparkle-example-osthreads.jar
16/12/19 10:30:29 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes
return exit codes
The curl command in the book is curl -XGET 'localhost:9200/agile_data_science/airplanes/_search?q=*', which is incorrect: "airplanes" should be singular, not plural. The correct line is: curl -XGET 'localhost:9200/agile_data_science/airplane/_search?q=*'
2.12 support - docs
Thank you for submitting an issue. Please refer to our issue policy
for information on what types of issues we address. For help with debugging your code, please refer to Stack Overflow.
Please fill in this template and do not delete it unless you are sure your issue is outs