“This business model is brilliant, especially in the area where people would like to automate manual tasks whenever possible. It helps minimize the skills gap and allows researchers to access the data even if they don’t know coding.”
"I work on big data and Octoparse helps get millions of data easily."
"With 20 servers at my service, data is fetched 20 times faster than using my own script."
"I scheduled my tasks to run hourly and it is helpful in getting streaming data."
The customer is a researcher at the University of Texas, studying social media interactions. In her words, it is a blessing for a researcher to come up with a meaningful research idea. However, data is not always easy to get, and a lack of data can be a hurdle for truth-seekers. The Octoparse web scraping solution makes large-scale data acquisition easy and achievable for researchers and helps turn insights into actions.
The study looks into live streaming interactions among users on different social media platforms. Yet gathering dynamic data at scale poses a great challenge for the research, as streaming data leaves no traces and can only be captured in the moment.
The Web Scraping Journey
Originality plays an important role in the success of academic research. Researchers race to introduce fresh ideas before anyone else gets to them. Our customer, however, found a way to build a natural moat around her research: it is powered by a unique set of data not easily attainable through any other means.
Compiling the data she needed for her study, however, was not a smooth experience. While she was capable of writing the scripts herself in Python and R, the real pain was the debugging process. "One time, an exception occurred, only after I've already extracted over a million data lines. All of a sudden, all the data turned into useless bytes and I had to start over again. It was extremely frustrating and costly time-wise," says the customer.
Octoparse came to her by referral, as a number of scholars at the university had already been using Octoparse to get data for their research. She gave it a shot, and it turned out to be a perfect tool for her. She was able to fetch the streaming data exactly the way she needed it. Now both of her projects are running smoothly, and she plans to dive deeper into the topic as her research is producing real results.
Unique research with unique data
"The key is Octoparse can get me unique data which others cannot get, no matter how many scripts they have written for the scraping task... streaming data is not easy to catch and in this sense, cloud extracting is very useful. I would say nobody could reproduce the study as long as they do not know Octoparse."
Faster run than self-built scripts
"Even though I run 4 scripts at one time, I still cannot parallel the speed of Octoparse because there are 20 concurrent runs on the cloud. It is fast."
"I know you guys have a pool of servers and you help rotate IPs during the scraping process. That's why I am not blocked. If I use my own scripts written by Python, I have to purchase IPs to prevent being blocked and put myself in a stressful position worrying about the security issue."
Friendly support ready to help
"The support team gives helpful solutions and guides me through the crawler building. Sometimes they would send me a tested crawler and tell me it would work. This is convenient."
"They are responsive. At the beginning, I got problems every now and then and I would turn to the support team. I learned from them."
Deep dive into what you care about
"Both of the two projects are working well at the present. I am going to write a few papers and make the best out of this data."
"Octoparse is for long-term use. I am scraping on a yearly basis for longitudinal research. And there may be some theoretical expansion hence more need for data."