The Ultimate Guide to PyTrends: the Google Trends API (with ...


By Lazarina Stoy
Posted on October 29, 2021 (updated March 10, 2022)
Tagged: API, Google Trends, Python
Explore search data at scale

Google Trends is a public platform that you can use to analyze interest over time for a given topic, search term, and even company.

Pytrends is an unofficial Google Trends API that provides different methods to download reports of trending results from Google Trends. The Python package can be used to automate processes such as quickly fetching data for later analysis.

In this article, I will share some insights on what you can do with Pytrends and how to do basic data pulls, providing snippets of Python code along the way. I will also answer some FAQs about Google Trends and, most importantly, address the limitations of using the API and the data.

Why use the Google Trends API instead of the Google Trends web interface?

There is no problem with just using the web interface. However, for a large-scale project that requires building a large dataset, this can become very cumbersome.

Manually researching and copying data from the Google Trends site is a research- and time-intensive process. When using an API, this time and effort are cut dramatically.

What do Google Trends values actually denote?

According to Google Trends, the values are calculated on a scale from 0 to 100, where 100 is the location with the most popularity as a fraction of total searches in that location, and a value of 50 indicates a location that is half as popular. A value of 0 indicates a location where there was not enough data for this term.

What data can you pull with the Google Trends API?

Related to a particular keyword you provide to the API, you can pull the following data:

- Interest Over Time
- Historical Hourly Interest
- Interest by Region
- Related Topics
- Related Queries
- Trending Searches
- Top Charts
- Keyword Suggestions

We will explore the different methods that are available in the API for pulling this data in a bit, alongside what the syntax for each of these methods looks like.

What parameters can you specify in your queries?

There are two objects that you can specify parameters for:

- optionsObject
- callback

The callback is an optional function, where the first parameter is an error and the second parameter is the result. If no callback is provided, then a promise is returned.
    const googleTrends = require('google-trends-api');
    googleTrends.apiMethod(optionsObject, [callback])

Note that this call signature is that of the JavaScript google-trends-api package, as the require call shows. Pytrends accepts comparable options (keywords, time frame, geo, category, property) as Python arguments.

The optionsObject is an object with the following option keys:

- keyword (required): target search term(s), string or array.
- startTime: start of the time period of interest (new Date() object). If startTime is not provided, a date of January 1, 2004 is assumed, as this is the oldest available Google Trends data.
- endTime: end of the time period of interest (new Date() object). If endTime is not provided, the current date is selected.
- geo: location of interest (string, or array if you wish to provide separate locations for each keyword).
- hl: preferred language (string, defaults to English).
- timezone: time zone (number, defaults to the time zone difference, in minutes, from UTC to the current locale (host system settings)).
- category: the category to search within (number, defaults to all categories).
- property: Google property to filter on. Defaults to a web search. (Enumerated string: ‘images’, ‘news’, ‘youtube’ or ‘froogle’, the latter relating to Google Shopping results.)
- resolution: granularity of the geo search (enumerated string: ‘COUNTRY’, ‘REGION’, ‘CITY’, ‘DMA’). resolution is specific to the interestByRegion method.
- granularTimeResolution: Boolean that dictates if the results should be given in a finer time resolution (if the span between startTime and endTime is less than one day, this should be set to true).

Are there any limitations to using the Pytrends Google Trends API?

Yes, there are. Before you begin, you must be aware of these few things:

1. Search terms and topics are two different things

Search terms and Topics are measured differently, so relatedTopics will not work with comparisons that contain both Search terms and Topics.

This leads to duplicate entries.

This is something easily observable in the Google Trends UI, which sometimes offers several topics for the same phrase.

2. Disproportionate results

When using the interest by region module, a higher value means a higher proportion of all queries, not a higher absolute query count.

So a small country where 80% of the queries are for “Google” will get twice the score of a giant country where only 40% of the queries are for that term.

3. Keyword Length Limitations

Google returns a response with code 400 when a keyword is longer than 100 characters.
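The two limitations just described lend themselves to a small sketch. The first function mimics the proportional scoring from point 2: it converts query counts into shares and rescales them so that the top region gets 100. The second checks the 100-character keyword limit from point 3 before any request is sent. The function names, region names, and query counts are invented for illustration, and Google does not publish its exact normalization, so treat this as an approximation of the idea and not the actual algorithm.

```python
# Illustrative sketch only: all names and numbers here are hypothetical.

MAX_KEYWORD_LENGTH = 100  # limit stated in the article


def relative_scores(query_counts):
    """Scale each region's share of queries so the top share equals 100.

    query_counts maps region -> (queries_for_term, total_queries).
    """
    shares = {
        region: term / total
        for region, (term, total) in query_counts.items()
        if total > 0
    }
    if not shares:
        return {}
    peak = max(shares.values())
    if peak == 0:
        return {region: 0 for region in shares}
    return {region: round(100 * share / peak) for region, share in shares.items()}


def split_by_length(keywords, limit=MAX_KEYWORD_LENGTH):
    """Separate keywords within the length limit from those exceeding it."""
    ok = [kw for kw in keywords if len(kw) <= limit]
    too_long = [kw for kw in keywords if len(kw) > limit]
    return ok, too_long


# 80% share in a small country vs. 40% share in a giant one
scores = relative_scores({
    "small_country": (80, 100),
    "giant_country": (400_000, 1_000_000),
})
print(scores)

ok, too_long = split_by_length(["Facebook", "x" * 101])
print(ok, len(too_long))
```

Here the small country scores 100 and the giant one 50, even though the giant country has far more queries in absolute terms.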
4. All data is relative, not absolute

The data Google Trends shows you are relative, not absolute. Forbes Baxter Associates explains this neatly:

Look at the chart for searches in 2019. When you see the red line on the chart reaching 100 in about June, it doesn’t mean there were 100 searches for that term in June. It means that was the most popular search in 2019 and that it hit its peak in June.

5. The categories are unreliable at best.

There are some top-level categories, but they are not representative of the real interest and data.

There are cases where the categories and the data don’t represent the real-life operations, and this may be due to a lack of understanding from the searcher, falsely attributed intent, or an algorithm bug.

Another limitation is that you can only pick one category. But if you need to choose more than one due to a discrepancy between the data in the two categories, then this becomes a challenge for the next steps in data consolidation, visualization, and analysis.

6. You can only provide five entries per chart.

This can be really annoying. If you are using the API for professional purposes, such as analyzing a particular market, this makes the reporting really challenging.

Most markets have more than five competitors in them. Most topics have more than five keywords in them. Comparisons need context in order to work.

What API Methods are available with the Google Trends API?

The following API methods are available:

autoComplete

Returns the results from the “Add a search term” input box in the Google Trends UI.

    # install pytrends (notebook syntax)
    !pip install pytrends

    # import the libraries
    import pandas as pd
    from pytrends.request import TrendReq

    # create the Pytrends request object
    pytrend = TrendReq()

    # get Google keyword suggestions
    keywords = pytrend.suggestions(keyword='Facebook')
    df = pd.DataFrame(keywords)
    df.head(5)

[Image: author’s own]

dailyTrends

Daily Search Trends highlights searches that jumped significantly in traffic among all searches over the past 24 hours and updates hourly.

These trends highlight specific queries that were searched, and an absolute number of searches made.

20 daily trending search results are returned. Here, a retroactive search for up to 15 days back can also be performed.
    # install pytrends (notebook syntax)
    !pip install pytrends

    # import the libraries
    import pandas as pd
    from pytrends.request import TrendReq

    # create the Pytrends request object
    pytrend = TrendReq()

    # get today's trending topics
    trendingtoday = pytrend.today_searches(pn='US')
    trendingtoday.head(20)

You can also get the topics that were trending historically, for instance for a particular year.

    # get Google Top Charts
    df = pytrend.top_charts(2020, hl='en-US', tz=300, geo='GLOBAL')
    df.head()

[Output: image by author]

interestOverTime

Numbers represent search interest relative to the highest point on the chart for the given region and time.

If you use multiple keywords for comparison, the return data will also contain an average result for each keyword.

You can check the regional interest for multiple search terms.

    # import the libraries
    import pandas as pd
    from pytrends.request import TrendReq

    # create the Pytrends request object
    pytrend = TrendReq()

    # provide your search terms (can also be competitors)
    kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

    # build the request for the keywords over the past month
    pytrend.build_payload(kw_list, timeframe='today 1-m')

    # interest by region
    regiondf = pytrend.interest_by_region()

    # keep only rows where no keyword has a value of 0
    regiondf = regiondf[(regiondf != 0).all(axis=1)]

    # drop all rows that have null values in all columns
    regiondf = regiondf.dropna(how='all', axis=0)

    # visualise as a bar chart
    regiondf.plot(figsize=(20, 12), y=kw_list, kind='bar')

[Image by author]

You can also get historical interest by specifying a time period.

    # historical (hourly) interest between two dates
    historicaldf = pytrend.get_historical_interest(
        kw_list,
        year_start=2020, month_start=10, day_start=1, hour_start=0,
        year_end=2021, month_end=10, day_end=1, hour_end=0,
        cat=0, geo='', gprop='', sleep=0)

    # visualise: plot a time series chart
    historicaldf.plot(figsize=(20, 12))

    # plot separate graphs, one per provided keyword
    historicaldf.plot(subplots=True, figsize=(20, 12))

This has to be my favorite one as it enables super cool additional projects such as forecasting, calculating the share of search (if using competitors as input), and other cool mini-projects.

interestByRegion

This allows examining search term popularity based on location during the specified time frame.
Values are calculated on a scale from 0 to 100, where 100 is the location with the most popularity as a fraction of total searches in that location, a value of 50 indicates a location that is half as popular, and a value of 0 indicates a location where the term was less than 1% as popular as the peak.

    # install pytrends (notebook syntax)
    !pip install pytrends

    # import the libraries
    import pandas as pd
    from pytrends.request import TrendReq

    # create the Pytrends request object
    pytrend = TrendReq()

    # provide your search terms
    kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

    # get interest by region for your search terms
    pytrend.build_payload(kw_list=kw_list)
    df = pytrend.interest_by_region()
    df.head(10)

realtimeTrends

Realtime Search Trends highlight stories that are trending across Google surfaces within the last 24 hours and are updated in real time.

    # install pytrends (notebook syntax)
    !pip install pytrends

    # import the libraries
    import pandas as pd
    from pytrends.request import TrendReq

    # create the Pytrends request object
    pytrend = TrendReq()

    # get trending searches for the United States
    # note: trending_searches returns the daily trending list; Pytrends also
    # provides realtime_trending_searches(pn='US') for the realtime feed
    df = pytrend.trending_searches(pn='united_states')
    df.head()

relatedQueries

Users searching for your term also searched for these queries. The following metrics are returned:

- Top: The most popular search queries. Scoring is on a relative scale where a value of 100 is the most commonly searched query, 50 is a query searched half as often, and a value of 0 is a query searched for less than 1% as often as the most popular query.
- Rising: Queries with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these queries are new and had few (if any) prior searches.

Check out the full code in the Colab link.
    # install pytrends (notebook syntax)
    !pip install pytrends

    # import the libraries
    import pandas as pd
    from pytrends.request import TrendReq
    from google.colab import files  # only available inside Google Colab

    # create the Pytrends request object
    pytrend = TrendReq()

    # provide your search terms
    kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']
    pytrend.build_payload(kw_list=kw_list)

    # get related queries (a dictionary keyed by keyword)
    related_queries = pytrend.related_queries()
    related_queries.values()

    # take the top and rising results for the first keyword only
    top = list(related_queries.values())[0]['top']
    rising = list(related_queries.values())[0]['rising']

    # convert to dataframes
    dftop = pd.DataFrame(top)
    dfrising = pd.DataFrame(rising)

    # join the two dataframes side by side
    joindfs = [dftop, dfrising]
    allqueries = pd.concat(joindfs, axis=1)

    # make duplicate column names unique by appending a numeric suffix
    cols = pd.Series(allqueries.columns)
    for dup in allqueries.columns[allqueries.columns.duplicated(keep=False)]:
        cols[allqueries.columns.get_loc(dup)] = (
            [dup + '.' + str(d_idx) if d_idx != 0 else dup
             for d_idx in range(allqueries.columns.get_loc(dup).sum())]
        )
    allqueries.columns = cols

    # rename to proper names
    allqueries.rename({'query': 'top query',
                       'value': 'top query value',
                       'query.1': 'related query',
                       'value.1': 'related query value'},
                      axis=1, inplace=True)

    # check your dataset
    allqueries.head(50)

    # save to csv
    allqueries.to_csv('allqueries.csv')

    # download from Colab
    files.download("allqueries.csv")

relatedTopics

Users searching for your term also searched for these topics. The following metrics are returned:

- Top: The most popular topics. Scoring is on a relative scale where a value of 100 is the most commonly searched topic, a value of 50 is a topic searched half as often, and a value of 0 is a topic searched for less than 1% as often as the most popular topic.
- Rising: Related topics with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these topics are new and had few (if any) prior searches.

The syntax here is the same as above, with changes only in the two rows where related_queries is mentioned:

    # Related Topics, returns a dictionary of dataframes
    related_topic = pytrend.related_topics()
    related_topic.values()

What to look out for next…

Hope you enjoyed this exploration.
You can find all of the code compiled into one Colab below (⬇️ Scroll down to the bottom of the page to view 🚀).

My next article will explore and go in-depth into 4 Beginner-Friendly Python Projects You Can Use Google Trends In.

Stay tuned and thanks for reading.

In the meantime, check out these resources created by brilliant people:

- Google Trends API exploration: Google Trends API for Python
- Script as a function for getting daily search data using Pytrends: pytrends/dailydata.py at master · GeneralMills/pytrends
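As a closing sketch, one possible way to live with the five-entries-per-chart limit discussed earlier is to split a longer keyword list into batches that all share one reference keyword, so that each batch can later be rescaled against it. The code below only builds the batches and makes no network calls. The helper name batch_with_anchor, the extra company names, and the choice of 'Google' as the shared keyword are assumptions made for illustration; none of them come from Pytrends itself.

```python
# Hypothetical helper: not part of Pytrends.

MAX_TERMS = 5  # Google Trends compares at most five entries per chart


def batch_with_anchor(keywords, anchor, max_terms=MAX_TERMS):
    """Split keywords into batches of at most max_terms, each led by anchor."""
    others = [kw for kw in keywords if kw != anchor]
    step = max_terms - 1  # one slot per batch is reserved for the anchor
    return [[anchor] + others[i:i + step] for i in range(0, len(others), step)]


# Example list with more than five entries (the last three are made up here)
competitors = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google',
               'Microsoft', 'Tesla', 'Spotify']

batches = batch_with_anchor(competitors, anchor='Google')
for batch in batches:
    print(batch)
```

Each batch could then be passed to pytrend.build_payload(kw_list=batch). Because values are relative within a single request, results from different batches are only comparable after rescaling them by the shared keyword's values.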


