data-science
Here are 9,075 public repositories matching this topic...
Might be worth adding a return_centers parameter to make_blobs.
Typically useful for comparing with e.g. GaussianMixture.means_ or KMeans.cluster_centers_, when the centers are randomly generated by make_blobs
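A minimal sketch of the idea behind the proposed return_centers option, written in plain Python rather than against scikit-learn. The helpers `make_blobs_with_centers` and `empirical_means` are hypothetical names used only for illustration. The sketch draws random centers, samples Gaussian points around them, and returns the centers so they can be compared with estimated ones.

```python
import random


def make_blobs_with_centers(n_samples, n_centers, std=1.0, box=(-10.0, 10.0), seed=0):
    """Hypothetical helper: sample 2-D Gaussian blobs and also return their true centers."""
    rng = random.Random(seed)
    # Draw the cluster centers uniformly inside the bounding box.
    centers = [(rng.uniform(*box), rng.uniform(*box)) for _ in range(n_centers)]
    points, labels = [], []
    for i in range(n_samples):
        label = i % n_centers  # assign samples to clusters round-robin
        cx, cy = centers[label]
        points.append((rng.gauss(cx, std), rng.gauss(cy, std)))
        labels.append(label)
    return points, labels, centers


def empirical_means(points, labels, n_centers):
    """Per-cluster mean of the sampled points, a stand-in for estimated centers."""
    sums = [[0.0, 0.0, 0] for _ in range(n_centers)]
    for (x, y), label in zip(points, labels):
        sums[label][0] += x
        sums[label][1] += y
        sums[label][2] += 1
    return [(sx / n, sy / n) for sx, sy, n in sums]


points, labels, centers = make_blobs_with_centers(3000, 3, seed=1)
for true_center, estimate in zip(centers, empirical_means(points, labels, 3)):
    print(true_center, estimate)
```

Without access to the generated centers, the caller has nothing to compare the estimates against, which is the gap the suggestion addresses.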
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Updated Nov 22, 2019 - Jupyter Notebook
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Updated Nov 22, 2019 - Python
I got a CoNLL-U file from my university in which the head column is filled with .
Processing such a file with the cli.convert method results in an int cast error in
https://github.com/explosion/spaCy/blob/master/spacy/cli/converters/conllu2json.py line 73
in the read_conllx method (head = (int(head) - 1) if head != "0" else id).
In the format documentation on https://universaldependencie
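One defensive way to handle a head value that is not an integer can be sketched as follows. This is not spaCy's code: `parse_head` is a hypothetical helper, and returning `None` for an unparseable head is an assumption about what a caller might want. It keeps the same arithmetic as the expression quoted above for valid values.

```python
def parse_head(head, token_id):
    """Hypothetical helper mirroring `(int(head) - 1) if head != "0" else id`,
    but tolerating head values that are not integers."""
    if head == "0":
        # A head of "0" marks the root; the quoted expression maps it to the token's own id.
        return token_id
    try:
        return int(head) - 1
    except ValueError:
        # Assumption: signal a missing head with None instead of raising.
        return None


print(parse_head("3", 5))
print(parse_head("0", 5))
print(parse_head("?", 5))
```

Whether a missing head should be skipped, mapped to the token itself, or rejected with a clearer error message is a design decision for the converter.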
The API documentation on readthedocs contains an "Edit on github" link in the header. Clicking the link leads to a GitHub 404 page:
For example,
Documentation page: https://ipython.readthedocs.io/en/latest/api/generated/IPython.conftest.html
Github link: https://github.com/ipython/ipython/blob/master/docs/source/api/generated/IPython.conftest.rst
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Updated Nov 22, 2019 - Python
Your new Mentor for Data Science E-Learning.
Updated Nov 22, 2019 - Jupyter Notebook
An awesome Data Science repository to learn from and apply to real-world problems.
Updated Nov 22, 2019
LaTeX support
I tried to use LaTeX in Dash, but it is not working.
It seems that the MathJax JavaScript library is not loaded.
Blueprint
The "Python Machine Learning (1st edition)" book code repository and info resource
Updated Nov 22, 2019 - Jupyter Notebook
Python 3.7.3, gensim 3.8.1: UnboundLocalError: local variable 'doc_no2' referenced before assignment
Client code:
model = LogEntropyModel(corpus=data_corpus, normalize=True)
Referenced code:
https://github.com/RaRe-Technologies/gensim/blob/44ea7931c916349821aa1c717fbf7e90fb138297/gensim/models/logentropy_model.py#L115
Exception thrown:
File "/anaconda3/lib/python3.7/site-packages/gensim/models/logentropy_model.py", line 76, in __init__
self.initialize(corpus)
File
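A generic illustration of how this kind of UnboundLocalError can arise in Python: a loop variable is only assigned if the loop body runs at least once. Whether this is the cause in the report above is an assumption, since the traceback is truncated. The function `two_pass_count` is hypothetical and is not gensim code.

```python
def two_pass_count(corpus):
    """Hypothetical two-pass routine: iterate over the corpus twice and
    return the last index seen in each pass."""
    for doc_no, _doc in enumerate(corpus):
        pass  # first pass, e.g. collecting global statistics
    for doc_no2, _doc in enumerate(corpus):
        pass  # second pass, e.g. computing per-document values
    # If the second loop never ran, doc_no2 was never assigned.
    return doc_no, doc_no2


# A list can be iterated twice, so both loop variables are set.
print(two_pass_count(["doc a", "doc b"]))

# A one-shot iterator is exhausted after the first pass.
try:
    two_pass_count(iter(["doc a", "doc b"]))
except UnboundLocalError as err:
    print("second pass saw no documents:", err)
```

Under that assumption, passing a re-iterable corpus (a list or an object whose `__iter__` restarts) avoids the problem.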
Dive into Machine Learning with Python Jupyter notebook and scikit-learn!
Updated Nov 21, 2019
VIP cheatsheets for Stanford's CS 229 Machine Learning
Updated Nov 22, 2019
See ray-project/ray#5626 -- the refactored DDPG somehow doesn't work on certain tasks such as MountainCar.
Deep learning library featuring a higher-level API for TensorFlow.
Updated Nov 22, 2019 - Python
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Updated Nov 22, 2019
A curated list of awesome big data frameworks, resources and other awesomeness.
Updated Nov 22, 2019
Question
- Can PretrainedTransformerTokenizer track character offsets like WordTokenizer?
Character offsets are important for calculating the answer span after wordpiece tokenization.
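To illustrate why offsets matter, here is a minimal sketch of offset tracking with a plain whitespace tokenizer. It does not use or reproduce the API of either tokenizer named in the question; `tokenize_with_offsets` and `answer_char_span` are hypothetical helpers, and a real wordpiece tokenizer would need the same bookkeeping at the subword level.

```python
import re


def tokenize_with_offsets(text):
    """Hypothetical helper: split on whitespace and keep (token, start, end) offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def answer_char_span(tokens, first, last):
    """Map an inclusive token span back to a character span in the original text."""
    return tokens[first][1], tokens[last][2]


text = "Hamlet was written by William Shakespeare"
tokens = tokenize_with_offsets(text)
start, end = answer_char_span(tokens, 4, 5)
print(text[start:end])
```

With the offsets stored next to each token, a predicted token span can always be converted back into a substring of the original passage.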
I'm a newbie in programming. I'm trying to use this library, and it's very useful for me.
I want to show the centroids in K-means clustering. How can I show them? Thank you so much.
Svr training error
Describe the bug
If you enter a string such as "3" into a cell and then use "toNumber()" to convert it to a number, toNumber() first tries to parse it as a Long, and then tries parsing it as a Double. So "3" will be stored as a Long, while "3.0" will be stored as a Double.
However, if you edit a cell by using the single cell "edit" option and enter a number, and change the 'type' to Number, OR on
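The integer-first, floating-point-second parsing order described above can be sketched in Python. This is an analogy only, not the tool's implementation: `to_number` is a hypothetical helper, and Python's `int` and `float` stand in for Long and Double (Python integers are unbounded, unlike a Long).

```python
def to_number(text):
    """Hypothetical helper: try an integer parse first, then fall back to float."""
    try:
        return int(text)
    except ValueError:
        # "3.0" is not a valid integer literal, so it ends up as a float.
        return float(text)


for raw in ("3", "3.0"):
    value = to_number(raw)
    print(repr(raw), "->", type(value).__name__)
```

The two results compare equal numerically but have different types, which is the kind of inconsistency the report describes.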
Description
@wutaomsft suggestion:
It would be a good discussion point what the preferred way to make references in notebooks is. I prefer not to have a "reference" section where references are separate from where they are referred to. Instead, link the reference "in place". And then add a paragraph for "additional reading", which is more descr
I cannot find a guide on choosing TPOT parameters. I know the API is explained in the documentation, but it's too brief. TPOT seems made for users unskilled in ML and GP. I opened another issue with my many questions. The sentence "We recommend using the default parameter unless you understand how the mutation rate affects GP algorithms." should have a link.
Open Machine Learning Course
Updated Nov 22, 2019 - Python
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
Updated Nov 22, 2019 - Jupyter Notebook
Today you can put Streamlit in "wide mode" via the Settings dialog in the UI. However, it would be great if the wide mode setting were sticky.
Option 1: just make Wide Mode sticky by persisting it in local storage!
Option 2: Provide a config option that toggles wide mode:
[browser]
wideMode = True
(for this we'd have to replicate much of the code used to propagate settin
The "Python Machine Learning (2nd edition)" book code repository and info resource
Updated Nov 22, 2019 - Jupyter Notebook
It would be great to have a new option in Pool, just like cat_features: a list of numbers or column names.
Is it a known issue (is it even an issue?) that model.test_on_batch returns the sum of losses of each entry in the batch instead of the average? I looked over the changelog and saw no reference to that.
model.train_on_batch does in fact return the average, but in the docs the return values of both methods are documented the same way.
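The difference between the two behaviours described above comes down to how per-sample losses are reduced. The following sketch is not Keras code; `batch_loss` is a hypothetical helper that shows the two reductions on made-up per-sample losses.

```python
def batch_loss(per_sample_losses, reduction="mean"):
    """Hypothetical helper: reduce per-sample losses to a single batch loss."""
    total = sum(per_sample_losses)
    if reduction == "sum":
        return total
    if reduction == "mean":
        return total / len(per_sample_losses)
    raise ValueError(f"unknown reduction: {reduction!r}")


losses = [0.5, 1.5, 1.0]  # illustrative values, not real model output
print("sum :", batch_loss(losses, "sum"))
print("mean:", batch_loss(losses, "mean"))
```

A summed loss scales with the batch size while an averaged one does not, so comparing numbers from the two methods is only meaningful once the reduction is known.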