Accepted 2021/10/03 from 10:35, Track4 Regular session (25 mins)

Best practices for using PHP to develop web crawlers! PHP Conference Japan 2021

Peter (peter279k)

In recent years, I have used PHP extensively to develop web crawlers, which has saved me a lot of time by automating repetitive work.
I also researched the topic and wrote a book about it (written in Mandarin). The book is available here: https://www.books.com.tw/products/0010882656
The book's title translates to English as: PHP Web Crawler Development: From beginner to advanced web crawler technique guides.

In this session, I will present the following topics:

  • The motivation for writing this book.
  • A guide to web crawler development and an introduction to the book's sections.
    • HTTP fundamentals and tips for inspecting HTTP packets and requests in a web browser.
    • A demonstration of building the PHP web crawler environment.
    • Advanced web crawling techniques, such as using a headless Chrome web browser.
    • Some labs and a live demonstration of developing a crawler.
  • Feedback and experiences from publishing this book.
  • Extended references and other advanced crawling tips that are not included in the book.
    • Inspecting HTTP packets and requests with tools other than a web browser.
    • Advanced extraction and decoding of CAPTCHA codes.

Discord Channel: #track4-4-b-php-web-crawlers
Joind.in: https://joind.in/event/php-conference-japan-2021/best-practices-for-using-php-to-develop-web-crawlers