Flood datasets often combine field reports, remote sensing, news records, and administrative summaries. Because these sources differ in format, spatial detail, and timing, the data must be cleaned before modelling or dashboard visualization.
1. Remove duplicates
Check incident date, location, source, and description to avoid counting the same event multiple times.
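As a minimal sketch of this check, the snippet below merges reports that share an incident date and a normalized location, and keeps every contributing source on the merged record. The field names (`date`, `location`, `source`, `description`) and the sample reports are invented for illustration. Treating "same date and same location" as "same event" is an assumption; descriptions usually differ between sources, so they are kept for manual review rather than used as part of the key.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not hide duplicates."""
    return " ".join(text.lower().split())


def deduplicate(records):
    """Merge records with the same date and normalized location.

    Assumption: one date plus one location identifies one event.
    The first description seen is kept; all distinct sources are collected.
    """
    merged = {}
    for rec in records:
        key = (rec["date"], normalize(rec["location"]))
        if key not in merged:
            entry = dict(rec)
            entry["sources"] = [entry.pop("source")]
            merged[key] = entry
        elif rec["source"] not in merged[key]["sources"]:
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())


# Invented sample reports: the first two describe the same event.
reports = [
    {"date": "2023-07-14", "location": "Ward 5, Riverside",
     "source": "field report", "description": "Houses flooded near the bridge"},
    {"date": "2023-07-14", "location": "ward 5,  riverside",
     "source": "news", "description": "Flooding reported near bridge"},
    {"date": "2023-07-20", "location": "Ward 5, Riverside",
     "source": "news", "description": "Second flood after heavy rain"},
]

unique_events = deduplicate(reports)
```

Keeping the list of sources on the merged record also shows how many independent reports support each event, which can be useful when judging reliability.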
2. Standardize locations
Convert local names into district, municipality, ward, basin, and coordinate fields when possible.
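One way to do this is a lookup table (gazetteer) that maps each known local name to the standard fields. The sketch below assumes such a table exists; the entry, the place names, and the coordinates are placeholders, not real data. Names that cannot be matched keep empty fields and are flagged, which reflects the "when possible" in the step above.

```python
LOCATION_FIELDS = ("district", "municipality", "ward", "basin", "lat", "lon")

# Hypothetical gazetteer with placeholder values.
GAZETTEER = {
    "old bridge market": {
        "district": "District A",
        "municipality": "Municipality 1",
        "ward": "3",
        "basin": "Basin X",
        "lat": 10.0,
        "lon": 20.0,
    },
}


def standardize_location(local_name, gazetteer=GAZETTEER):
    """Return standard location fields for a local name.

    Unmatched names get None in every field and matched=False,
    so they can be reviewed by hand instead of being guessed.
    """
    key = " ".join(local_name.lower().split())
    match = gazetteer.get(key, {})
    row = {field: match.get(field) for field in LOCATION_FIELDS}
    row["local_name"] = local_name
    row["matched"] = bool(match)
    return row


known = standardize_location("  Old Bridge  Market ")
unknown = standardize_location("Upper Ferry Crossing")
```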
3. Separate hazard and impact
Flood occurrence and inundation area describe the hazard; damaged houses, casualties, and road closures describe the impact. Store each in its own field rather than mixing them in one column.
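A minimal sketch of this separation uses two record types linked by an event identifier. The class names, field names, and units (`inundation_area_km2`) are assumptions made for the example. Missing values stay as `None` rather than `0`, so "not reported" is not confused with "none".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HazardRecord:
    """What the flood was: occurrence and extent."""
    event_id: str
    flood_occurred: bool
    inundation_area_km2: Optional[float]


@dataclass
class ImpactRecord:
    """What the flood caused: damage, casualties, disruption."""
    event_id: str
    damaged_houses: Optional[int]
    casualties: Optional[int]
    road_closed: Optional[bool]


def split_record(row):
    """Split one mixed row into a hazard record and an impact record."""
    hazard = HazardRecord(
        event_id=row["event_id"],
        flood_occurred=row["flood_occurred"],
        inundation_area_km2=row.get("inundation_area_km2"),
    )
    impact = ImpactRecord(
        event_id=row["event_id"],
        damaged_houses=row.get("damaged_houses"),
        casualties=row.get("casualties"),
        road_closed=row.get("road_closed"),
    )
    return hazard, impact


# Invented mixed row; casualties were not reported, so the key is absent.
mixed_row = {
    "event_id": "EV-001",
    "flood_occurred": True,
    "inundation_area_km2": 2.5,
    "damaged_houses": 12,
    "road_closed": True,
}
hazard, impact = split_record(mixed_row)
```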
4. Check time windows
For satellite flood extents, separate pre-event, event, and post-event windows clearly.
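The sketch below labels each satellite acquisition date relative to an event's start and end dates. The function name, the 30-day buffer, and the sample dates are assumptions for illustration; a suitable buffer depends on the sensor's revisit time and how quickly the water recedes.

```python
from datetime import date, timedelta


def classify_window(acquired, event_start, event_end, buffer_days=30):
    """Label an acquisition date as pre-event, event, post-event, or outside.

    Scenes further than buffer_days from the event are labelled "outside"
    so they are not silently used as a baseline.
    """
    if event_start > event_end:
        raise ValueError("event_start must not be after event_end")
    buffer = timedelta(days=buffer_days)
    if event_start <= acquired <= event_end:
        return "event"
    if event_start - buffer <= acquired < event_start:
        return "pre-event"
    if event_end < acquired <= event_end + buffer:
        return "post-event"
    return "outside"


# Invented event window and acquisition dates.
start, end = date(2023, 7, 14), date(2023, 7, 18)
scenes = [date(2023, 7, 2), date(2023, 7, 15), date(2023, 7, 25), date(2023, 1, 1)]
labels = [classify_window(d, start, end) for d in scenes]
```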
5. Keep raw and cleaned copies
Store the original dataset and the cleaned analysis version separately for transparency.
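As a minimal sketch, the snippet below reads a raw CSV, writes the cleaned rows to a separate folder, and uses a SHA-256 checksum to confirm the raw file was not modified. The folder layout (`raw/`, `cleaned/`), the `_cleaned` file suffix, and the sample file are assumptions for the example; it runs in a temporary directory so nothing is left on disk.

```python
import csv
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Checksum of a file, used to show the raw copy is unchanged."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def write_cleaned(raw_path, cleaned_dir, clean_row):
    """Read the raw CSV, clean each row, and write a separate cleaned CSV.

    The raw file is only opened for reading and is never overwritten.
    """
    raw_path = Path(raw_path)
    cleaned_dir = Path(cleaned_dir)
    cleaned_dir.mkdir(parents=True, exist_ok=True)
    cleaned_path = cleaned_dir / f"{raw_path.stem}_cleaned.csv"

    with raw_path.open(newline="", encoding="utf-8") as src:
        rows = [clean_row(row) for row in csv.DictReader(src)]
    if not rows:
        raise ValueError("raw file contains no data rows")

    with cleaned_path.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return cleaned_path


def strip_fields(row):
    """Example cleaning step: trim stray whitespace from every value."""
    return {key: value.strip() for key, value in row.items()}


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw" / "floods.csv"
    raw.parent.mkdir()
    raw.write_text("date,location\n2023-07-14, Ward 5 \n", encoding="utf-8")

    checksum_before = sha256_of(raw)
    cleaned = write_cleaned(raw, Path(tmp) / "cleaned", strip_fields)
    raw_unchanged = sha256_of(raw) == checksum_before
    cleaned_text = cleaned.read_text(encoding="utf-8")
```

Recording the raw file's checksum next to the cleaned version lets a reviewer verify later which original the analysis was built from.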
Keep the workflow simple: define the input, check the geometry or data source, validate the output, and then document the assumptions in the drawing, model, or dashboard.