I would like to find a way to keep a loop from stopping prematurely when one of the elements it iterates over lacks an underlying element, which produces a subscript out of bounds error.

In this case I would like to place NA at the index where the error occurs and continue with the remaining iterations of the loop, whether their elements exist or not.

See the example below. The loop iterates over six elements, but because an underlying element does not exist on the fourth page, it stops at the fourth iteration with the subscript out of bounds message.

library(xml2)
library(XML)
library(rvest)

# Hotel identifiers that complete the TripAdvisor review URL
country <- list(
  "d1756170-Reviews-La_Trobada_Hotel_Boutique-Ripoll",
  "d1447619-Reviews-Solana_del_Ter-Ripoll",
  "d9998932-Reviews-La_Trobada_Hotel_Sport-Ripoll",
  "d16830029-Reviews-Subirana_Rural-Les_Llosses",
  "d2219428-Reviews-La_Sequia_Molinar-Campdevanol",
  "d13803410-Reviews-Angelats_Hotel-Ribes_de_Freser"
)

for (i in country) {
  url <- paste("https://www.tripadvisor.co.uk/Hotel_Review-g1063979-", i,
               "_Province_of_Girona_Catalonia.html", sep = "")
  # Take the grandchild of the matched node; this step fails when it does not exist
  node <- xml_child(xml_child(html_node(read_html(url), "._2dtF3ueh")))
  # The first character of the aria-label is read as the star rating
  Stars <- as.numeric(substr(html_attr(node, "aria-label"), start = 1, stop = 1))
  print(Stars)
}

[1] 2
[1] 3
[1] 2
Error in xml_children(x)[[search]] : subscript out of bounds

The output I aim to obtain should look like this:

[1] 2
[1] 3
[1] 2
[1] NA
[1] 3
[1] 3
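
One direction that might produce this behaviour is to wrap the failing step in tryCatch() so that an error yields NA instead of aborting the loop. The sketch below is only an illustration under assumptions: it does not touch the network, get_stars() is a hypothetical stand-in for the scraping expression, and the pages list simulates six pages of which the fourth has no underlying element.

```r
# Hypothetical stand-in for the scraping step: reads the first element of a
# list and fails with "subscript out of bounds" when the list is empty.
get_stars <- function(x) {
  as.numeric(x[[1]])
}

# Simulated inputs (assumption): the fourth entry has no underlying element.
pages <- list(list("2"), list("3"), list("2"), list(), list("3"), list("3"))

# Pre-fill with NA so a failed iteration leaves NA at its index.
stars <- rep(NA_real_, length(pages))

for (i in seq_along(pages)) {
  # On error, return NA and move on to the next iteration.
  stars[i] <- tryCatch(
    get_stars(pages[[i]]),
    error = function(e) NA_real_
  )
}

print(stars)
```

In the real loop, the body of get_stars() would be replaced by the read_html()/html_node()/xml_child() expression. Note that tryCatch() used this way hides every error, including network failures, not only the missing element.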
Barnaby