streaming-data

Here are 215 public repositories matching this topic...

trantor commented Jan 23, 2020

Hello.
I've come across what (to me) seems to be a problem with the FILENAME and FILENUM variables.

# mlr --version
Miller v5.6.2

# cat /tmp/csv1
A,B,C
_2GB,255,2
_4GB,120,4
_6GB,50,6
_10GB,10,10

# cat /tmp/csv2
FIRST,SECOND,THIRD,FOURTH
1,2,3,4
5,6,7,8
9,10,11,12
13,14,15,16

# mlr --icsv cat then put 'print FILENAME'   /tmp/csv1 /tmp/csv2
/tmp/csv1
A=_2GB,B=255,C=2
/
benthos
inliquid commented Mar 25, 2020

We use http_server as the input and http_client as one of the outputs (for part of a message batch). When http_client returns an error, benthos retries the failed message indefinitely (#415). More significantly, it stops accepting other, normal messages.

Here is the log from when I first send a message that causes http_client to get a 500 error, and

claudiofahey commented Feb 3, 2020

Problem description
Documentation for ScalingPolicy.byDataRate does not clearly indicate whether targetKBps uses units of 1000 bytes or 1024 bytes.

Problem location
Pravega client, ScalingPolicy.java

Suggestions for an improvement
Determine the units used and update the documentation.
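
To make the ambiguity concrete, here is a small sketch of how far the two readings of "KB" diverge. The value of `target_kbps` is a made-up example, and nothing here reflects which interpretation Pravega actually uses; that is exactly what the issue asks to have documented.

```python
# Hypothetical target rate, chosen only for illustration.
target_kbps = 100

# Reading 1: "KB" means 1000 bytes (decimal kilobyte).
bytes_per_sec_decimal = target_kbps * 1000

# Reading 2: "KB" means 1024 bytes (binary kilobyte, i.e. KiB).
bytes_per_sec_binary = target_kbps * 1024

# Relative difference between the two readings.
relative_gap = (bytes_per_sec_binary - bytes_per_sec_decimal) / bytes_per_sec_decimal

print(bytes_per_sec_decimal, bytes_per_sec_binary, f"{relative_gap:.1%}")
```

The gap is a constant 2.4% regardless of the target, which is small but can matter when tuning a scaling threshold close to observed traffic.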

cybertyche commented Dec 10, 2018

There are already data policies at ingress for data that arrives "late". We can drop, adjust, or throw when data arrives late, and we can hold data in reserve for a certain period of time to allow some reordering.

However, if a data point arrives "too early" we do not have a way to deal with it currently. For instance, if the current data time is X, and the next data point arrives with a timest
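
The excerpt is cut off before it defines "too early", so the following is only a sketch under assumptions: it treats a point as early when its timestamp is ahead of the current data time by more than a hypothetical `max_lead`, and it reuses the drop, adjust, and throw actions described for late data. The names `Action` and `apply_time_policy` are illustrative and not part of the project.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    DROP = "drop"
    ADJUST = "adjust"
    THROW = "throw"


def apply_time_policy(current_time, point_time, late_action, early_action, max_lead):
    """Return (action, timestamp) for one incoming data point.

    late_action and early_action are expected to be DROP, ADJUST, or THROW.
    max_lead is an assumed tolerance for how far ahead a point may be.
    """
    if point_time < current_time:
        action = late_action
    elif point_time - current_time > max_lead:
        action = early_action
    else:
        # Within the accepted window: keep the point unchanged.
        return Action.ACCEPT, point_time

    if action is Action.THROW:
        raise ValueError(f"timestamp {point_time} outside accepted window")
    if action is Action.ADJUST:
        # Clamp the timestamp to the nearest edge of the accepted window.
        if point_time < current_time:
            return Action.ADJUST, current_time
        return Action.ADJUST, current_time + max_lead
    if action is Action.DROP:
        return Action.DROP, None
    raise ValueError(f"unsupported policy action: {action}")
```

Handling both directions in one function keeps the late and early cases symmetric; whether clamping is an acceptable "adjust" for early data is a design question the issue leaves open.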

wissousad commented Aug 6, 2019

Hello, I have a CSV file with 9 features and 9 expected targets, and I want to test 2 regression models on this data (which should be generated as a stream).

When I test the MultiTargetRegressionHoeffdingTree and RegressorChain on this data I get a bad R2 score, but when I normalize my data with scikit-learn I get a pretty good R2 score. The problem is that I should not use sci
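
The excerpt is cut off, so the exact constraint is unknown. Assuming the goal is to normalize features without a batch pass over the whole dataset, one common option is to standardize incrementally with running statistics (Welford's algorithm). The class below is a standalone sketch, not an API of the library discussed in the issue, and `RunningStandardizer` is a hypothetical name.

```python
import math


class RunningStandardizer:
    """Standardize one feature using running mean and variance (Welford)."""

    def __init__(self):
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one new sample into the running statistics.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def transform(self, x):
        # Scale with the statistics seen so far (population variance).
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / self.n)
        return (x - self.mean) / std if std > 0 else 0.0


# Usage: update with each sample as it arrives, then scale it.
scaler = RunningStandardizer()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    scaler.update(value)
    scaled = scaler.transform(value)
```

In a multi-feature stream, one such object per feature column would be kept. Early samples are scaled with unstable statistics, so scores on the first part of the stream may differ from a batch-normalized run.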

carlbrochu commented Apr 18, 2019

Describe the bug
Some utility projects are duplicated between singular (utility) and plural (utilities) versions. Let's align on the plural versions.

To Reproduce
Steps to reproduce the behavior:

  1. Go to Services\DataX.Utilities folder
  2. Notice duplicate folders like 2 cosmosdb utils

Expected behavior
One dll per area

cloudflow
goodboy commented Feb 10, 2020

As per #98, in #102 we added proper testing of (almost) all examples in the docs (more land in #99 too).

One of the issues is that some of these examples are also re-implemented in the test suite.
The duplication between the examples in examples/ and what's in the tests is currently hard to avoid, since the tests take some of the documented examples and parameterize them with things li
