-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[test-utils] Pretty-print Nexus config parse error #6500
base: main
Are you sure you want to change the base?
Conversation
Presently, errors parsing the test Nexus config are printed using `fmt::Debug`, due to the use of `expect()`. The `Debug` format for `TomlError` is...pretty hard to read, since it looks like this: <details> <summary>EXTREMELY UGLY WALL OF TEXT, LOOK AT YOUR OWN RISK</summary> ``` --- STDERR: nexus-reconfigurator-execution omicron_zones::test::test_clean_up_cockroach_zones --- thread 'omicron_zones::test::test_clean_up_cockroach_zones' panicked at nexus/test-utils/src/lib.rs:186:10: failed to load config.test.toml: LoadError { path: "tests/config.test.toml", kind: Parse(Error { inner: Error { inner: TomlError { message: "missing field `region_snapshot_replacement_finish`", raw: Some("#\n# Oxide API: configuration file for test suite\n#\n\n[console]\n# Directory for static assets. Absolute path or relative to CWD.\nstatic_dir = \"tests/static\"\nsession_idle_timeout_minutes = 480 # 8 hours\nsession_absolute_timeout_minutes = 1440 # 24 hours\n\n# List of authentication schemes to support.\n[authn]\nschemes_external = [\"spoof\", \"session_cookie\"]\n\n#\n# NOTE: for the test suite, if mode = \"file\", the file path MUST be the sentinel\n# string \"UNUSED\". The actual path will be generated by the test suite for each\n# test.\n#\n[log]\nlevel = \"trace\"\nmode = \"file\"\nif_exists = \"fail\"\npath = \"UNUSED\"\n\n# Configuration for interacting with the timeseries database. This is overwritten\n# by the test suite once ClickHouse starts, with the actual address on which it\n# is listening.\n[timeseries_db]\naddress = \"[::1]:0\"\n\n# Tunable parameters used during tests\n[tunables]\n# Allow small subnets, so we can test IP address exhaustion easily / quickly\nmax_vpc_ipv4_subnet_prefix = 29\n\n[deployment]\n# Identifier for this instance of Nexus.\n# NOTE: The test suite always overrides this.\nid = \"913233fe-92a8-4635-9572-183f495429c4\"\nrack_id = \"c19a698f-c6f9-4a17-ae30-20d711b8f7dc\"\ntechport_external_server_port = 0\n\n# Nexus may need to resolve external hosts (e.g. to grab IdP metadata).\n# These are the DNS servers it should use.\nexternal_dns_servers = [\"1.1.1.1\", \"9.9.9.9\"]\n\n[deployment.dropshot_external]\n# NOTE: for the test suite, the port MUST be 0 (in order to bind to any\n# available port) because the test suite will be running many servers\n# concurrently.\nbind_address = \"127.0.0.1:0\"\nrequest_body_max_bytes = 8388608\n\n[deployment.dropshot_internal]\nbind_address = \"127.0.0.1:0\"\nrequest_body_max_bytes = 8388608\n\n#\n# NOTE: for the test suite, the internal DNS address will be replaced with one\n# that's started by the test runner.\n#\n[deployment.internal_dns]\ntype = \"from_subnet\"\nsubnet.net = \"fd00:1122:3344:0100::/56\"\n\n#\n# NOTE: for the test suite, the database URL will be replaced with one\n# appropriate for the database that's started by the test runner.\n#\n[deployment.database]\ntype = \"from_url\"\nurl = \"postgresql://[email protected]:0/omicron?sslmode=disable\"\n\n# Dendrite (dataplane daemon) API configuration\n[dendrite.switch0]\naddress = \"[::1]:12224\"\n\n[background_tasks]\ndns_internal.period_secs_config = 60\ndns_internal.period_secs_servers = 60\ndns_internal.period_secs_propagation = 60\ndns_internal.max_concurrent_server_updates = 5\ndns_external.period_secs_config = 60\ndns_external.period_secs_servers = 60\ndns_external.period_secs_propagation = 60\ndns_external.max_concurrent_server_updates = 5\nmetrics_producer_gc.period_secs = 60\n# How frequently we check the list of stored TLS certificates. This is\n# approximately an upper bound on how soon after updating the list of\n# certificates it will take _other_ Nexus instances to notice and stop serving\n# them (on a sunny day).\nexternal_endpoints.period_secs = 60\nnat_cleanup.period_secs = 30\nbfd_manager.period_secs = 30\n# How frequently to collect hardware/software inventory from the whole system\n# (even if we don't have reason to believe anything has changed).\ninventory.period_secs = 600\n# Maximum number of past collections to keep in the database\ninventory.nkeep = 3\n# Disable inventory collection altogether (for emergencies)\ninventory.disable = false\nphantom_disks.period_secs = 30\nphysical_disk_adoption.period_secs = 30\n# Disable automatic disk adoption to avoid interfering with tests.\nphysical_disk_adoption.disable = true\ndecommissioned_disk_cleaner.period_secs = 60\n# Disable disk decommissioning cleanup to avoid interfering with tests.\ndecommissioned_disk_cleaner.disable = true\nblueprints.period_secs_load = 100\nblueprints.period_secs_execute = 600\nblueprints.period_secs_collect_crdb_node_ids = 600\nsync_service_zone_nat.period_secs = 30\nswitch_port_settings_manager.period_secs = 30\nregion_replacement.period_secs = 30\n# The driver task should wake up frequently, something like every 10 seconds.\n# however, if it's this low it affects the test_omdb_success_cases test output.\n# keep this 30 seconds, so that the test shows \"triggered by an explicit\n# signal\" instead of \"triggered by a periodic timer firing\"\nregion_replacement_driver.period_secs = 30\ninstance_watcher.period_secs = 30\nservice_firewall_propagation.period_secs = 300\nv2p_mapping_propagation.period_secs = 30\nabandoned_vmm_reaper.period_secs = 60\nsaga_recovery.period_secs = 600\nlookup_region_port.period_secs = 60\n# The purpose of the `instance-updater` background task is to ensure that update\n# sagas are always *eventually* started for instances whose database state has\n# changed, even if the update saga was not started by the Nexus replica handling\n# an update from sled-agent. This is to ensure that updates are performed even\n# in cases where a Nexus crashes or otherwise disappears between when the\n# updated VMM and migration state is written to CRDB and when the resulting\n# update saga actually starts executing. However, we would prefer update sagas\n# to be executed in a timely manner, so for integration tests, we don't want to\n# *rely* on the instance-updater background task for running these sagas.\n#\n# Therefore, disable the background task during tests.\ninstance_updater.disable = true\ninstance_updater.period_secs = 60\nregion_snapshot_replacement_start.period_secs = 30\nregion_snapshot_replacement_garbage_collection.period_secs = 30\nregion_snapshot_replacement_step.period_secs = 30\n\n[default_region_allocation_strategy]\n# we only have one sled in the test environment, so we need to use the\n# `Random` strategy, instead of `RandomWithDistinctSleds`\ntype = \"random\"\n"), keys: [], span: Some(0..0) } } }) } note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace ``` </details> This commit changes the `load_test_config` function to match the result and use the `panic!` macro to format the error using `fmt::Display` if the test config file cannot be parsed. This should make debugging malformed configs way nicer. I previously made the same change for the `gateway-test-utils::setup::load_test_config` function in commit 3b89711
Should we also display the chain of errors? |
Agh, I thought this would do that...i always forget which format specifier I need to print the whole chain, is it |
I think |
Ah, yup, since the |
Yeah -- you can use |
Presently, errors parsing the test Nexus config are printed using
fmt::Debug
, due to the use ofexpect()
. TheDebug
format forTomlError
is...pretty hard to read, since it looks like this:EXTREMELY UGLY WALL OF TEXT, LOOK AT YOUR OWN RISK
This commit changes the
load_test_config
function to match the resultand use the
panic!
macro to format the error usingfmt::Display
ifthe test config file cannot be parsed. This should make debugging
malformed configs way nicer.
I previously made the same change for the
gateway-test-utils::setup::load_test_config
function in commit3b89711