manbatman

This commit is contained in:
Villers Krisztián 2026-05-12 16:00:09 +02:00
parent 67c7394114
commit c560579285
6 changed files with 3689 additions and 0 deletions

45
.github/copilot-instructions.md vendored Normal file
View File

@ -0,0 +1,45 @@
# Copilot Instructions for `o8_pics_size`
## Build, test, and lint commands
- Build: `cargo build`
- Run app: `cargo run`
- Run all tests: `cargo test`
- Run a single test: `cargo test <test_name>`
- Formatting check: `cargo fmt --check`
- Lint (if available in your toolchain): `cargo clippy -- -D warnings`
## High-level architecture
The application is a single-binary Rust CLI (`src/main.rs`) built as a staged pipeline:
1. Pick an input Excel file (`.xls`/`.xlsx`) via `rfd::FileDialog`.
2. Read workbook/sheet data with `calamine`.
3. Use terminal UI selectors (`dialoguer::Select`) to choose:
- sheet
- `Cikkszám` column
- `Url` column
- `img_sequence` column
4. Build input rows from sheet data (header row is row 0, data starts from row 1).
5. Download images from URLs using blocking `reqwest`.
6. Decode image bytes with `image` to extract dimensions; compute size from downloaded byte length.
7. Export results to a new workbook with `rust_xlsxwriter`, saved as `result_[uuid].xlsx` next to the input file.
Core helper responsibilities are split into functions in `main.rs`:
- selection helpers: `prompt_sheet_selection`, `prompt_column_selection`
- sheet parsing helpers: `extract_headers`, `collect_input_rows`, `cell_string`
- image metadata fetch: `fetch_image_metadata`
- output generation: `build_output_path`, `write_results_excel`
## Key conventions in this repository
- **Fail-fast error handling:** most failures print a clear `eprintln!` message and exit with status 1.
- **Column naming is contract-like:** output headers are currently written as:
- `Cikkszám`
- `Sqeuence` (keep this spelling unless intentionally changed)
- `Url`
- `Width (px)`
- `Height (px)`
- `Size (KB)`
- **Interactive flow is TUI-driven:** keep sheet/column selection cursor-based (`dialoguer::Select`), not numeric text prompts.
- **Output path pattern is fixed:** write result files as `result_[uuid].xlsx` in the same directory as the source workbook.

7
.gitignore vendored Normal file
View File

@ -0,0 +1,7 @@
.DS_Store
in/
# Added by cargo
/target

3252
Cargo.lock generated Normal file

File diff suppressed because it is too large Load Diff

13
Cargo.toml Normal file
View File

@ -0,0 +1,13 @@
[package]
name = "o8_pics_size"
version = "0.1.0"
edition = "2024"
[dependencies]
calamine = "0.31"
dialoguer = "0.11"
image = "0.25"
reqwest = { version = "0.12", default-features = false, features = ["blocking", "rustls-tls"] }
rfd = "0.15"
rust_xlsxwriter = "0.83"
uuid = { version = "1", features = ["v4"] }

View File

@ -1 +1,19 @@
# o8_pics_size
Run the app to open a file picker limited to Excel files (`.xls`, `.xlsx`).
After selecting a file, it reads the workbook sheet names and opens a terminal picker where you choose a sheet with arrow keys and Enter.
Then it reads the header row and asks you to map three columns using the same TUI picker:
- `Cikkszám`
- `Url`
- `img_sequence`
After mapping, it downloads each image from the selected `Url` column, reads image metadata, and writes a new Excel file next to the source workbook as `result_[uuid].xlsx`.
For testing, it currently processes only the first 100 non-empty URL rows.
Output columns:
- `Cikkszám`
- `Sqeuence`
- `Url`
- `Width`
- `Height`
- `Size` (KB)

354
src/main.rs Normal file
View File

@ -0,0 +1,354 @@
use std::{
path::{Path, PathBuf},
time::Duration
};
use calamine::{Data, Range, Reader, open_workbook_auto};
use dialoguer::{Select, theme::ColorfulTheme};
use image::GenericImageView;
use reqwest::blocking::Client;
use rust_xlsxwriter::Workbook;
use uuid::Uuid;
fn main() {
let Some(file) = rfd::FileDialog::new()
.add_filter("Excel files", &["xls", "xlsx"])
.pick_file()
else {
eprintln!("No Excel file selected.");
return;
};
println!("Selected Excel file: {}", file.display());
let mut workbook = match open_workbook_auto(&file) {
Ok(workbook) => workbook,
Err(err) => {
eprintln!("Failed to open workbook {}: {}", file.display(), err);
std::process::exit(1);
}
};
let sheet_names = workbook.sheet_names().to_vec();
if sheet_names.is_empty() {
eprintln!("No sheets found in workbook {}.", file.display());
std::process::exit(1);
}
let selected_sheet = prompt_sheet_selection(&sheet_names);
println!("\nSelected sheet: {}", selected_sheet);
let range = match workbook.worksheet_range(selected_sheet) {
Ok(range) => range,
Err(err) => {
eprintln!(
"Failed to read sheet '{}' in {}: {}",
selected_sheet,
file.display(),
err
);
std::process::exit(1);
}
};
let headers = extract_headers(&range);
if headers.is_empty() {
eprintln!("No header row found in sheet '{}'.", selected_sheet);
std::process::exit(1);
}
let cikkszam_idx = prompt_column_selection(&headers, "Select column for 'Cikkszám'", &[]);
let url_idx = prompt_column_selection(&headers, "Select column for 'Url'", &[cikkszam_idx]);
let img_sequence_idx = prompt_column_selection(
&headers,
"Select column for 'img_sequence'",
&[cikkszam_idx, url_idx],
);
println!(
"\nSelected columns:\n- Cikkszám: {} ({})\n- Url: {} ({})\n- img_sequence: {} ({})",
headers[cikkszam_idx],
column_label(cikkszam_idx),
headers[url_idx],
column_label(url_idx),
headers[img_sequence_idx],
column_label(img_sequence_idx)
);
let input_rows = collect_input_rows(&range, cikkszam_idx, img_sequence_idx, url_idx);
if input_rows.is_empty() {
eprintln!("No data rows with URL values were found.");
std::process::exit(1);
}
println!("Processing first {} rows for testing.", input_rows.len());
let client = match Client::builder()
.user_agent("o8_pics_size/0.1")
.timeout(Duration::from_secs(45))
.build()
{
Ok(client) => client,
Err(err) => {
eprintln!("Failed to initialize HTTP client: {}", err);
std::process::exit(1);
}
};
let mut output_rows = Vec::new();
let total = input_rows.len();
for (index, row) in input_rows.iter().enumerate() {
println!("[{}/{}] Fetching {}", index + 1, total, row.url);
let metadata = match fetch_image_metadata(&client, &row.url) {
Ok(metadata) => Some(metadata),
Err(err) => {
eprintln!("Failed to fetch '{}': {}", row.url, err);
None
}
};
output_rows.push(OutputRow {
cikkszam: row.cikkszam.clone(),
sequence: row.sequence.clone(),
url: row.url.clone(),
metadata,
});
}
let output_path = build_output_path(&file);
if let Err(err) = write_results_excel(&output_path, &output_rows) {
eprintln!(
"Failed to write output workbook {}: {}",
output_path.display(),
err
);
std::process::exit(1);
}
println!("\nCreated output workbook: {}", output_path.display());
}
fn prompt_sheet_selection(sheet_names: &[String]) -> &str {
let theme = ColorfulTheme::default();
let selection = match Select::with_theme(&theme)
.with_prompt("Select a sheet")
.items(sheet_names)
.default(0)
.interact()
{
Ok(index) => index,
Err(err) => {
eprintln!("Failed to select sheet: {}", err);
std::process::exit(1);
}
};
match sheet_names.get(selection) {
Some(name) => name,
None => {
eprintln!("Invalid sheet selection index: {}", selection);
std::process::exit(1);
}
}
}
fn extract_headers(range: &Range<Data>) -> Vec<String> {
let Some(first_row) = range.rows().next() else {
return Vec::new();
};
first_row
.iter()
.enumerate()
.map(|(index, cell)| {
let value = cell.to_string().trim().to_owned();
if value.is_empty() {
format!("Unnamed {}", column_label(index))
} else {
value
}
})
.collect()
}
fn prompt_column_selection(headers: &[String], prompt: &str, excluded_indices: &[usize]) -> usize {
let mut option_labels = Vec::new();
let mut option_indices = Vec::new();
for (index, header) in headers.iter().enumerate() {
if excluded_indices.contains(&index) {
continue;
}
option_labels.push(format!("{} ({})", header, column_label(index)));
option_indices.push(index);
}
if option_indices.is_empty() {
eprintln!("No selectable columns available for '{}'.", prompt);
std::process::exit(1);
}
let theme = ColorfulTheme::default();
let selected = match Select::with_theme(&theme)
.with_prompt(prompt)
.items(&option_labels)
.default(0)
.interact()
{
Ok(index) => index,
Err(err) => {
eprintln!("Failed to select column: {}", err);
std::process::exit(1);
}
};
option_indices[selected]
}
fn column_label(mut index: usize) -> String {
let mut label = String::new();
loop {
let remainder = index % 26;
label.insert(0, (b'A' + remainder as u8) as char);
index /= 26;
if index == 0 {
break;
}
index -= 1;
}
label
}
#[derive(Clone)]
struct InputRow {
cikkszam: String,
sequence: String,
url: String,
}
#[derive(Clone, Copy)]
struct ImageMetadata {
width: u32,
height: u32,
size_kb: f64,
}
struct OutputRow {
cikkszam: String,
sequence: String,
url: String,
metadata: Option<ImageMetadata>,
}
fn collect_input_rows(
range: &Range<Data>,
cikkszam_idx: usize,
sequence_idx: usize,
url_idx: usize,
) -> Vec<InputRow> {
let mut rows = Vec::new();
for row in range.rows().skip(1) {
let url = cell_string(row.get(url_idx));
if url.is_empty() {
continue;
}
rows.push(InputRow {
cikkszam: cell_string(row.get(cikkszam_idx)),
sequence: cell_string(row.get(sequence_idx)),
url,
});
}
rows
}
fn cell_string(cell: Option<&Data>) -> String {
match cell {
Some(value) => value.to_string().trim().to_owned(),
None => String::new(),
}
}
fn fetch_image_metadata(client: &Client, url: &str) -> Result<ImageMetadata, String> {
let response = client
.get(url)
.send()
.map_err(|err| format!("request failed: {}", err))?;
let response = response
.error_for_status()
.map_err(|err| format!("HTTP error: {}", err))?;
let bytes = response
.bytes()
.map_err(|err| format!("failed to read response body: {}", err))?;
let image = image::load_from_memory(&bytes)
.map_err(|err| format!("unable to decode image bytes: {}", err))?;
let (width, height) = image.dimensions();
let size_kb = bytes.len() as f64 / 1024.0;
Ok(ImageMetadata {
width,
height,
size_kb,
})
}
fn build_output_path(input_path: &Path) -> PathBuf {
input_path.with_file_name(format!("result_{}.xlsx", Uuid::new_v4()))
}
fn write_results_excel(path: &Path, rows: &[OutputRow]) -> Result<(), String> {
let mut workbook = Workbook::new();
let worksheet = workbook.add_worksheet();
worksheet
.write_string(0, 0, "Cikkszám")
.map_err(|err| err.to_string())?;
worksheet
.write_string(0, 1, "Sqeuence")
.map_err(|err| err.to_string())?;
worksheet
.write_string(0, 2, "Url")
.map_err(|err| err.to_string())?;
worksheet
.write_string(0, 3, "Width (px)")
.map_err(|err| err.to_string())?;
worksheet
.write_string(0, 4, "Height (px)")
.map_err(|err| err.to_string())?;
worksheet
.write_string(0, 5, "Size (KB)")
.map_err(|err| err.to_string())?;
for (index, row) in rows.iter().enumerate() {
let output_row = (index + 1) as u32;
worksheet
.write_string(output_row, 0, &row.cikkszam)
.map_err(|err| err.to_string())?;
worksheet
.write_string(output_row, 1, &row.sequence)
.map_err(|err| err.to_string())?;
worksheet
.write_string(output_row, 2, &row.url)
.map_err(|err| err.to_string())?;
if let Some(metadata) = row.metadata {
worksheet
.write_number(output_row, 3, metadata.width as f64)
.map_err(|err| err.to_string())?;
worksheet
.write_number(output_row, 4, metadata.height as f64)
.map_err(|err| err.to_string())?;
worksheet
.write_number(output_row, 5, metadata.size_kb)
.map_err(|err| err.to_string())?;
}
}
workbook.save(path).map_err(|err| err.to_string())
}