Updated Jun 24, 2021 - Go
#
hdfs
Here are 650 public repositories matching this topic...
SeaweedFS is a distributed storage system for blobs, objects, files, and data lake, to store and serve billions of files fast! Blob store has O(1) disk seek, local tiering, cloud tiering. Filer supports cross-cluster active-active replication, Kubernetes, POSIX, S3 API, encryption, Erasure Coding for warm storage, FUSE mount, Hadoop, WebDAV.
kubernetes
distributed-systems
fuse
replication
s3
posix
s3-storage
hdfs
distributed-storage
distributed-file-system
erasure-coding
object-storage
blob-storage
seaweedfs
hadoop-hdfs
tiered-file-system
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Updated Jun 25, 2021 - Go
jcmincke commented Jun 21, 2021
import string
my_project = 'pj-becfr-eagle-ci-dev'
my_dataset = 'dm_express_assortment_V1'
import ibis
import ibis_bigquery
from google.cloud import bigquery
import pandas as pd
my_table = 'issue_3_table'
pdf = pd.DataFrame({'a': [1], 'b': [2], 'c': [3]})
# Load client
client = bigquery.Client(project=my_project)
# Load data to BQ
client.delete_table(my_dataset + "." +
The Universal Storage Engine
data-science
storage-engine
s3
sparse-data
scientific-computing
s3-storage
arrays
hdfs
data-analysis
dataframes
tiledb
dense-data
sparse-arrays
Updated Jun 24, 2021 - C++
bbaja42 commented Dec 18, 2018
Similar to how Unix ls works, the parameter could be -t.
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
python
linux
docker
aws
elasticsearch
devops
json
cloudformation
spark
hadoop
avro
travis-ci
solr
gcp
hbase
pyspark
hdfs
parquet
dockerhub
gcf
Updated Jun 8, 2021 - Python
Web tool for Kafka Connect
Updated Dec 30, 2020 - JavaScript
Kafka Connect HDFS connector
Updated Jun 22, 2021 - Java
Divolte Collector
Updated Jun 23, 2021 - Java
Open
Migrate to goavro v2
efirs opened Oct 9, 2019
Fundamentals of Spark with Python (using PySpark), code examples
python
machine-learning
sql
database
big-data
spark
apache-spark
hadoop
analytics
parallel-computing
distributed-computing
apache
map-reduce
pyspark
hdfs
dataframe
mlib
Updated Jul 7, 2020 - Jupyter Notebook
DC/OS SDK is a collection of tools, libraries, and documentation for easy integration of technologies such as Kafka, Cassandra, HDFS, Spark, and TensorFlow with DC/OS.
kubernetes
elasticsearch
kafka
cassandra
tensorflow
declarative
mesos
dcos
hdfs
stateful-containers
dcos-data-services-guild
Updated Jun 6, 2021 - Java
lovechang1986 opened Mar 21, 2017
HDFS compress tar zip snappy gzip uncompress untar codec hadoop spark
Updated Apr 24, 2018 - Scala
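ElasticCTR, PaddlePaddle's elastic-computing recommendation system, is an open-source, enterprise-grade recommendation system solution built on Kubernetes. It combines high-accuracy CTR models continuously refined in Baidu's business scenarios, the large-scale distributed training capability of the open-source PaddlePaddle framework, and an industrial-grade elastic scheduling service for sparse parameters, helping users deploy a recommendation system in a Kubernetes environment with one click. It offers high performance, industrial-grade deployment, and an end-to-end experience, and as an open-source suite it supports further in-depth custom development.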
Updated Jul 11, 2020 - Python
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
testing
hadoop
scale
performance-metrics
hdfs
testing-tools
performance-analysis
hdfs-dfs
performance-test
performance-testing
hadoop-filesystem
hadoop-framework
scale-up
hadoop-hdfs
Updated Aug 6, 2019 - Java
jcrist commented Aug 16, 2018
Given the new key-value store event stream, it'd be nice to have something like:
$ skein kv events <application id> [options...]
where the process blocks, and logs the event stream to the console until interrupted. This would be useful for debugging, as well as demos.
HDFS Shell is an HDFS manipulation tool for working with the functions integrated in Hadoop DFS
Updated Aug 30, 2019 - Java
Problem description
Be able to read public GCS files without providing credentials.
Steps/code to reproduce the problem