
Find corrupted point for track_sensor #393

Closed
Jean-Romain opened this issue Nov 23, 2020 · 8 comments
Labels: Feature request (asking for a new feature), Rejected request (feature request that won't be implemented)

Jean-Romain commented Nov 23, 2020

Following numerous issues opened by @lucas-johnson (#392, #391, #388, #336, #327), add a function: diagnose_invalid_pulse().

This could return a table with the indices of the points, the points themselves, and a comment explaining why each pulse is invalid.
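A minimal sketch of what such a function might have looked like (it was never implemented; the output columns and the single check below are assumptions, not lidR API):

```r
# Hypothetical sketch of diagnose_invalid_pulse(); never part of lidR.
library(data.table)

diagnose_invalid_pulse <- function(las)
{
  dt <- data.table::copy(las@data)
  dt[, idx := .I]  # index of each point in the cloud
  # One known corruption: several points of the same pulse (same gpstime)
  # sharing the same ReturnNumber.
  dup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)
  bad <- dt[, .SD[dup(ReturnNumber)], by = gpstime]
  bad[, comment := "duplicated ReturnNumber within pulse"]
  bad[]
}
```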

@Jean-Romain Jean-Romain added the Feature request Asking for a new feature label Nov 23, 2020
@Jean-Romain Jean-Romain self-assigned this Nov 23, 2020
@Jean-Romain Jean-Romain added the Rejected request Feature request that won't be implemented label May 27, 2021
@candelas762

Hi. Was this ever included in any version of lidR? After having read what I think are all the questions and issues with track_sensor() so far, especially #392, #396 and #595, I still can't find the reason why some parts of catalogs return the error Error: After keeping only first and last returns of multiple returns pulses, X pulses still have more than 2 points. This dataset is corrupted and gpstime is likely to be invalid. I wanted to try dropping those points if they were not many.

@Jean-Romain

The feature was never implemented. Can you share a file that produces the error?

@candelas762

candelas762 commented Jul 20, 2022

Sure. You can download the file from here. The "retile" file is a part of the bigger ALS file read as a catalog with opt_chunk_buffer(ctg) <- 0.5 and opt_chunk_size(ctg) <- 250. I isolated that part because it is small (easy to share) and still produces the error. I also share the whole file in case these intensity functions need to work with bigger files.

las <- readLAS("C:/retile_578250_6677250.las", filter = "-drop_class 7")
las_check(las)
#> Checking the data
  - Checking coordinates... ✓
  - Checking coordinates type... ✓
  - Checking coordinates range... ✓
  - Checking coordinates quantization... ✓
  - Checking attributes type... ✓
  - Checking ReturnNumber validity... ✓
  - Checking NumberOfReturns validity... ✓
  - Checking ReturnNumber vs. NumberOfReturns... ✓
  - Checking RGB validity... ✓
  - Checking absence of NAs... ✓
  - Checking duplicated points...
    ⚠ 3 points are duplicated and share XYZ coordinates with other points
  - Checking degenerated ground points... ✓
  - Checking attribute population...
    🛈 'EdgeOfFlightline' attribute is not populated
  - Checking gpstime incoherances
    ✗ 80464 pulses (points with the same gpstime) have points with identical ReturnNumber
  - Checking flag attributes... ✓
  - Checking user data attribute...
    🛈 158422 points have a non 0 UserData attribute. This probably has a meaning
 Checking the header
  - Checking header completeness... ✓
  - Checking scale factor validity... ✓
  - Checking point data format ID validity... ✓
  - Checking extra bytes attributes validity... ✓
  - Checking the bounding box validity... ✓
  - Checking coordinate reference system... ✓
 Checking header vs data adequacy
  - Checking attributes vs. point format... ✓
  - Checking header bbox vs. actual content... ✓
  - Checking header number of points vs. actual content... ✓
  - Checking header return number vs. actual content... ✓
 Checking coordinate reference system...
  - Checking if the CRS was understood by R... ✓
 Checking preprocessing already done 
  - Checking ground classification... yes
  - Checking normalization... no
  - Checking negative outliers... ✓
  - Checking flightline classification... yes
 Checking compression
  - Checking attribute compression... no

las <- filter_duplicates(las) # I think that track_sensor() does this automatically 
las <-  filter_firstlast(las)
sensor <- track_sensor(las, Roussel2020(pmin = 500), multi_pulse = T)
#> Error: After keeping only first and last returns of multiple returns pulses, 4 pulses still have 
#> more than 2 points. This dataset is corrupted and gpstime is likely to be invalid.

@Jean-Romain

I see

 - Checking gpstime incoherances
    ✗ 80464 pulses (points with the same gpstime) have points with identical ReturnNumber

Label the incorrect ones

las = filter_firstlast(las)
t = las@data[, .N, by = gpstime]
err = t[N > 2]
las@data$err = las$gpstime %in% err$gpstime

Plot the incorrect ones

plot(las, color = "err")

Show one of the incorrect ones

tt = err$gpstime
las@data[gpstime == tt[1]]
#>          X       Y      Z   gpstime Intensity ReturnNumber NumberOfReturns ScanDirectionFlag EdgeOfFlightline Classification Synthetic_flag Keypoint_flag
#> 1: 578410.5 6677400 255.02 114610821         4            1               2                 0                0              1          FALSE         FALSE
#> 2: 578412.2 6677401 243.03 114610821        10            2               2                 0                0              1          FALSE         FALSE
#> 3: 578411.0 6677400 251.70 114610821       127            1               1                 0                0              1          FALSE         FALSE
#>    Withheld_flag ScanAngleRank UserData PointSourceID  err
#> 1:         FALSE            -8        1             1 TRUE
#> 2:         FALSE            -8        1             1 TRUE
#> 3:         FALSE            -8        1             1 TRUE

Plot those 3 points

sub = las[las$gpstime == tt[1]]
plot(sub, size = 4)

The 3 points are perfectly aligned. I'd say they come from the same pulse but the numbering is incorrect. This pulse is a 3-return pulse, and its points should have 1 2 3 as ReturnNumber and 3 3 3 as NumberOfReturns. You probably have 80000+ points of this kind. Your dataset is incorrectly populated and the function is working as expected.

@candelas762

candelas762 commented Jul 20, 2022

Hi. Thanks for taking the time to look through the data. This piece of code is very helpful:

las = filter_firstlast(las)
t = las@data[, .N, by = gpstime]
err = t[N > 2]
las@data$err = las$gpstime %in% err$gpstime

Now I am wondering whether it would be possible to solve this type of error by re-ranking the ReturnNumber attribute according to the Z coordinates, assuming the points with the same gpstime come from the same pulse. I have the flightlines of the aircraft (uploaded to the same folder where you found the LAS file) and would be able to check whether the calculated position of the sensor is approximately the same.

I have written some lines for this, but it is really slow because it loops one by one over the detected wrong pulses in tt:

tt = err$gpstime

for (i in seq_along(tt)) {
  n = which(las$gpstime == tt[i]) # Which position have the points in the cloud
  sub = las[las$gpstime == tt[i]] # subset the cloud
  
  rn = as.integer(rank(-sub$Z))   # New return numbers according to Z values
                  
  las$ReturnNumber[n] = rn        # Substitute old RN for the new ones
}

sensor <- track_sensor(las, Roussel2020(), multi_pulse = T)

The resulting sensor position is promising, but I have only run it on a "retile" file because it takes a long time when the number of wrong pulses is high, as in my case.

@Jean-Romain

Jean-Romain commented Jul 20, 2022

The fastest way is

library(lidR)
library(data.table)
las <- readLAS("retile_578250_6677250.las", filter = "-drop_class 7")
setorder(las@data, gpstime, UserData, Z)
las@data[, `:=`(ReturnNumber = 1:.N, NumberOfReturns = .N), by = .(gpstime, UserData)]
las = las_update(las)

Notice that you missed processing by UserData: you have a multi-pulse emission device. This is also why las_check() still reports a gpstime error: it does not account for multiple pulses that share the same gpstime but have different UserData.

las = filter_firstlast(las)
t = las@data[, .N, by = .(gpstime, UserData)]
setorder(t, N)
err = t[N > 2]
err
#> Empty data.table (0 rows and 3 cols): gpstime,UserData,N

@candelas762

That line of code is amazing. Thanks a lot.

I included those lines in case the error comes up, and compared the result with the actual flightlines shapefile. It apparently fixes the corrupted points and the sensor is well positioned.

In the image you can see the LAS file read as a catalog and the flightlines colored by ID. The black points are the sensor positions calculated for the catalog, and the red circles are the sensor positions calculated just for the first retile read as a LAS (colored pink).
(image: test_rnadjust)

It seems like the bigger the data, the better positioned the sensor gets.

I was wondering whether, in the case where you have access to the actual flight lines, it would be possible to build your own "track_sensor()" result.

@Jean-Romain

It seems like the bigger the data, the better positioned the sensor gets.

Of course. With a tiny area you have very few points to resolve the interpolation problem. But the improvement is actually not very big.

I was wondering whether, in the case where you have access to the actual flight lines, it would be possible to build your own "track_sensor()" result.

If the shapefile is made of lines, I guess it is not associated with timestamps. We need time information to match each point with the position of the sensor. You can sample the lines into discrete points, but if those points are not associated with a corresponding gpstime it won't work.
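For reference, sampling the lines into discrete points could be sketched with the sf package (the filename is an assumption, and st_line_sample requires a projected CRS):

```r
# Sketch: densify flightline geometries into points. This does NOT
# replace track_sensor(): without a gpstime attribute on the sampled
# points there is nothing to join against the point cloud.
library(sf)
lines <- st_read("flightlines.shp")               # hypothetical file
pts   <- st_line_sample(lines, density = 1 / 10)  # one point every 10 m
pts   <- st_cast(pts, "POINT")
```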
