headless-chrome

Tell us about your environment:

Puppeteer version: 1.14
Platform / OS version: Windows
Node.js version: 10

What steps will reproduce the problem?
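Pass an async function (one that returns a promise) to await page.waitForResponse(response => condition) as the urlOrPredicate argument.

What is the expected result?
The async predicate is awaited and its resolved value decides whether the response matches.

What happens instead?
The call does not wait for the promise returned by the predicate.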
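
A possible workaround, sketched under the assumption that the predicate's return value is not awaited in this version: listen for `response` events directly and await the async predicate for each one. `waitForResponseAsync` and its parameters are hypothetical names, not part of Puppeteer. The sketch relies only on `page` being an event emitter that emits `response`.

```javascript
// Hypothetical helper: resolves with the first response for which the
// async predicate resolves to a truthy value, or rejects on timeout.
function waitForResponseAsync(page, asyncPredicate, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    // Stop the timer and stop listening once the wait is settled.
    const cleanup = () => {
      clearTimeout(timer);
      page.removeListener('response', onResponse);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(`No matching response within ${timeoutMs} ms`));
    }, timeoutMs);
    async function onResponse(response) {
      try {
        // Awaiting here is the point: the promise itself is always truthy.
        if (await asyncPredicate(response)) {
          cleanup();
          resolve(response);
        }
      } catch (err) {
        cleanup();
        reject(err);
      }
    }
    page.on('response', onResponse);
  });
}
```

It could then be called as `await waitForResponseAsync(page, async r => (await r.text()).includes('done'))`, where the `'done'` marker is only an illustration.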

Here is the code snippet:

nightmare
    .on('console', (log, msg) => {
        console.log(msg)
    })
    .on('error', (err) => {
        console.log(err)
    })
    .goto(url)
    .inject('js', 'jquery.min.js')
    .wait('#btnSearchClubs')
    .click('#btnSearchClubs')
    .wait(5000)
    .evaluate(function () {
        const pageAnchor = Array.from(document.querySelectorAll(

Page Requirements

ability to add a single page to the archive (like echo <url> | archivebox add)
ability to import a list of pages / feed of URLs into the archive (like archivebox add <url>)
link to the homepage of the index /
link to the django admin list of URLs for editing the archive /admin/
link to the archivebox github repo & documentati

Consider when the rendertron-middleware is deployed for example on Cloud Run behind a Firebase Hosting site, where requests are directed via Firebase Hosting which has a different domain/hostname from the Cloud Run instance.

The middleware uses req.get('host') which
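
To illustrate the kind of fix this points at, here is a minimal sketch that prefers a forwarded host over the instance's own host. `resolveHost` is a hypothetical helper, not part of rendertron-middleware, and it assumes the proxy in front of the instance sets an `X-Forwarded-Host` header, which should only be trusted when the proxy is trusted.

```javascript
// Hypothetical helper: prefer the host the client originally requested
// (as forwarded by a proxy) over the host of the instance handling it.
function resolveHost(req) {
  const forwarded = req.get('x-forwarded-host');
  if (forwarded) {
    // A proxy chain may append several hosts; the first is the original.
    return forwarded.split(',')[0].trim();
  }
  return req.get('host');
}
```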

What is the current behavior?

Crawling a website that uses # (hashes) for url navigation does not crawl the pages that use #

The urls using # are not followed.

If the current behavior is a bug, please provide the steps to reproduce

Try crawling a website like mykita.com/en/

What is the motivation / use case for changing the behavior?

Though hashes are not meant to chan
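
One way hash links can end up unvisited is when the fragment is stripped before the crawler checks whether a URL was already seen. The sketch below shows that effect with an opt-in to keep the fragment. `visitKey` and `keepHash` are hypothetical names, and this is not a description of how headless-chrome-crawler is implemented.

```javascript
// Hypothetical helper: build the key used to decide whether a URL was
// already visited. With keepHash=false, "#/a" and "#/b" collapse into one.
function visitKey(rawUrl, { keepHash = false } = {}) {
  const url = new URL(rawUrl);
  if (!keepHash) {
    url.hash = '';
  }
  return url.href;
}
```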

var pgsql = require('pdf-bot/src/db/pgsql')

module.exports = {
  api: {
    token: 'api-token'
  },
  db: pgsql({
    database: 'pdfbot',
    username: 'pdfbot',
    password: 'pdfbot',
    port: 5432
  }),
  webhook: {
    secret: '1234',
    url: 'http://localhost:3000/webhooks/pdf'
  }
}

Can you please change "username" to "user", because it's the correct option ther
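
Until the docs change, the mismatch could be papered over with a small mapping step. This is a sketch that assumes the options are handed to node-postgres, which names the key `user`. `toPgOptions` is a hypothetical helper, not part of pdf-bot.

```javascript
// Hypothetical helper: accept the documented `username` key but hand
// `user` to the driver. An explicit `user` key takes precedence.
function toPgOptions(options) {
  const { username, ...rest } = options;
  if (username !== undefined && rest.user === undefined) {
    rest.user = username;
  }
  return rest;
}
```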

When users run Apify.launchPuppeteer() on a Docker image without Chromium, they see:

Error: Failed to launch chrome! spawn /usr/src/app/node_modules/puppeteer/.local-chromium/linux-706915/chrome-linux/chrome ENOENT

We should show a better error that tells them how to fix it.
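
A rough sketch of what such an error could look like, assuming the failure can be recognised from the message text shown above. `explainLaunchError` is a hypothetical helper and the hint wording is only illustrative.

```javascript
// Hypothetical helper: turn a raw "spawn ... ENOENT" launch failure into
// a message that says what is likely wrong. The wording is illustrative.
function explainLaunchError(err) {
  const message = String(err && err.message);
  if (message.includes('Failed to launch chrome') && message.includes('ENOENT')) {
    const hint =
      'The Chrome executable was not found. Use a Docker image that ' +
      'includes Chromium, or point the launcher at an installed browser.';
    const wrapped = new Error(`${hint}\nOriginal error: ${message}`);
    wrapped.cause = err; // keep the original error for debugging
    return wrapped;
  }
  // Anything else is passed through untouched.
  return err;
}
```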

Actual

Test environments don't set up certificates. Using Taiko against these environments produces certificate errors. For example

goto("https://172.0.1.111:1234") 
[FAIL] Error: Navigation to url https://172.0.1.111:1234 
failed.
REASON: net::ERR_CERT_AUTHORITY_INVALID, run .trace for more info.

Change

Ignore certificates by default. The tester can choose to not ig
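
As a sketch of the proposed default, the launch flags could be built so that certificate errors are ignored unless the tester opts out. `buildChromeArgs` and the `ignoreCertificateErrors` option name are hypothetical here. `--ignore-certificate-errors` is an existing Chrome command-line switch, but how Taiko wires it internally is not shown in this issue.

```javascript
// Hypothetical helper: build Chrome launch flags, ignoring certificate
// errors unless the tester opts out.
function buildChromeArgs({ ignoreCertificateErrors = true } = {}) {
  const args = ['--headless'];
  if (ignoreCertificateErrors) {
    args.push('--ignore-certificate-errors');
  }
  return args;
}
```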

I'm trying to follow the quick guide. I've done these steps:

serverless create -u https://github.com/adieuadieu/serverless-chrome/tree/master/examples/serverless-framework/aws
export AWS_PROFILE=serverless
npm run deploy

However, at point 3, I'm not in the right folder, am I?

If I cd aws and then run npm run deploy, I get an error:

Serverless Error ---------------

Extend the docs on how the bot can be used with GitHub Actions.

GitHub Actions is a neat and fast new way to interact with GitHub, its workflows, etc.
It's much more powerful than the regular CI systems we are used to.

This would make it possible to add things like "trigger the lighthousebot via a keyword in a comment" or similar.
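
The comment-keyword idea could look roughly like this inside an action that receives the `issue_comment` webhook payload. `shouldTrigger` and the `/lighthouse` keyword are hypothetical, and the sketch reads only the `action`, `issue.pull_request` and `comment.body` fields of that payload.

```javascript
// Hypothetical helper: decide whether an `issue_comment` webhook payload
// should start a Lighthouse run. The keyword is an assumed convention.
function shouldTrigger(payload, keyword = '/lighthouse') {
  const isNewComment = payload.action === 'created';
  // Comments on pull requests carry an `issue.pull_request` object.
  const isPullRequest = Boolean(payload.issue && payload.issue.pull_request);
  const body = (payload.comment && payload.comment.body) || '';
  return isNewComment && isPullRequest && body.trim().startsWith(keyword);
}
```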

Issue
When using SingleBrowserImplementation, if Chrome gets into a state in which it cannot be restarted, the error does not bubble up, which causes a JavaScript unhandledRejection. Since there is no way to catch this, it forces consuming code into a dead end. Using Node v8.11.1.

Reproduction:
I have not found a way to put chrome into such a state that it cannot be restarted so the rep

The graphql version referenced by navalia@1.3.0 and the jest version referenced by create-react-app@1.1.1 seem to be incompatible with one another. Running the tests gives this error message:

 FAIL  src/react.spec.js
  ● Test suite failed to run

    /Users/sgreene/src/tutorials/test-navalia/node_modules/graphql/index.mjs:2
    export { graphql, graphqlSync } from './graphql';
    ^^^^^^

Steps to re

Pretty simple stuff: I couldn't find examples for the file flag when checking the docs. Also, the info presented when running --help doesn't say much. I ended up checking file.go to get the proper syntax and then came across this:

$ gowitness file -s ~/Desktop/urls
$ gowitness file --source ~/Desktop/urls --threads -2

So maybe just add it directly to the main docs?

Thanks

@djerrystyle

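My current approach is to concatenate the URLs by hand. For example, for http://www.A.com with two known paths, /path_a and /path_b,
the command is: crawlergo -c chrome http://www.A.com/ http://www.A.com/path_a http://www.A.com/path_b

There are two problems:

If there are many known paths, concatenating them by hand is tedious.
Does passing the concatenated URLs as arguments give the same result as running them one at a time, or is there a difference? I have not verified this.

Of course, it would be best if a later version had a parameter that supports multiple paths as entry points.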
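
Until such a parameter exists, the manual concatenation can be scripted. A minimal sketch, in which `buildTargets` is a hypothetical helper and the only crawlergo usage is the command form quoted above:

```javascript
// Hypothetical helper: expand a base URL and known paths into the list
// of entry URLs passed on the command line.
function buildTargets(base, paths) {
  const targets = [new URL('/', base).href];
  for (const path of paths) {
    targets.push(new URL(path, base).href);
  }
  return targets;
}

const targets = buildTargets('http://www.A.com', ['/path_a', '/path_b']);
console.log(['crawlergo', '-c', 'chrome', ...targets].join(' '));
```

Note that the `URL` class normalises the host name to lower case, so the printed URLs use `www.a.com`. This does not answer the second question, whether one combined run behaves the same as separate runs.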

Originally posted by @djerrystyle in 0Kee-Team/crawlergo#31 (comment)

None of the completion triggers worked for my react app. I checked puppeteer's waitUntil: "networkidle0" and that worked for me. https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagegotourl-options
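
For reference, the Puppeteer option mentioned above can be wrapped like this. `gotoWhenIdle` is a hypothetical name. `waitUntil: 'networkidle0'` and `timeout` are options of `page.goto` as described in the linked API docs.

```javascript
// Hypothetical wrapper around page.goto: navigation is considered done
// once there have been no network connections for at least 500 ms.
function gotoWhenIdle(page, url, timeoutMs = 30000) {
  return page.goto(url, { waitUntil: 'networkidle0', timeout: timeoutMs });
}
```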

Here are 328 public repositories matching this topic...

puppeteer / puppeteer

segmentio / nightmare

prisma-archive / chromeless

pirate / ArchiveBox

alvarcarto / url-to-pdf-api

GoogleChrome / rendertron

yujiosaka / headless-chrome-crawler

miyakogi / pyppeteer

esbenp / pdf-bot

apifytech / apify-js

getgauge / taiko

adieuadieu / serverless-chrome

GoogleChromeLabs / lighthousebot

checkly / puppeteer-examples

berstend / puppeteer-extra

transitive-bullshit / awesome-puppeteer

thomasdondorf / puppeteer-cluster

zhentaoo / puppeteer-deep

joelgriffith / navalia

rubycdp / ferrum

nesk / puphpeteer

sensepost / gowitness

0Kee-Team / crawlergo

csbun / thal

OnetapInc / chromy

ebidel / try-puppeteer

westy92 / html-pdf-chrome

rubycdp / cuprite

sambaiz / puppeteer-lambda-starter-kit

NimaSoroush / differencify
