Improved rhat diagnostic #3266

aleksgorica · 2024-02-06T21:44:37Z

Submission Checklist

Run unit tests: ./runTests.py src/test/unit
Run cpplint: make cpplint
Declare copyright holder and open-source license: see below

Summary

Addresses issue #3269
Changes compute_potential_scale_reduction, based on Vehtari
Rhat computation moved into the rhat function
Added the function rank_transform
Changed test values to match the new output

Intended Effect

rank_transform: Computes normalized average ranks for the draws, then transforms them to normal scores using the inverse normal transformation and a fractional offset.

rhat: Computes rhat as before; the computational part is just moved into a new function.

compute_potential_scale_reduction: Copies the draws into a matrix object, then computes bulk rhat and tail rhat and returns the maximum.
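
The fractional offset used by rank_transform can be sketched on its own. The helper name fractional_offset is hypothetical; the constants mirror the expression (avgRank - 3.0 / 8.0) / (size - 2.0 * 3.0 / 8.0 + 1.0) from the diff in this PR. The inverse normal step (boost::math::quantile in the PR) is deliberately left out so that the sketch depends only on the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: maps an average rank in [1, size] to a probability
// strictly inside (0, 1) using the 3/8 fractional offset.
// In the PR this value is then passed to the standard normal quantile.
inline double fractional_offset(double avg_rank, std::size_t size) {
  assert(size > 0);
  return (avg_rank - 3.0 / 8.0)
         / (static_cast<double>(size) - 2.0 * 3.0 / 8.0 + 1.0);
}
```

With this offset the smallest and largest ranks map to probabilities that sum to 1, so the resulting normal scores are symmetric around zero and never reach plus or minus infinity.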

How to Verify

Compare the results with the ArviZ rhat function.

Side Effects

Documentation

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):
Aleks Stepančič

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

bob-carpenter · 2024-02-06T21:53:21Z

Hi, @aleksgorica and welcome to the Stan project.

In terms of process, there should be an issue specifying the feature which this PR addresses. Please add an issue.

I would also suggest not removing the old functionality but instead just adding new functionality for the ranked version. That way, it won't break backward compatibility when it's added.

When it's ready to review for inclusion, please ping me and I can do it.

SteveBronder

Just leaving comments for now. Not sure if we want to modify this code directly or have a new function

SteveBronder · 2024-02-06T21:50:02Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  int rows = draws.rows();
+  int cols = draws.cols();
+  int size = rows * cols;


Just FYI, Eigen types use Eigen::Index for index values and std::vectors use std::size_t.

Generally, for sizes that we know will not change, we make them const.

Suggested change

-  int rows = draws.rows();
-  int cols = draws.cols();
-  int size = rows * cols;
+  const Eigen::Index rows = draws.rows();
+  const Eigen::Index cols = draws.cols();
+  const Eigen::Index size = rows * cols;

(and change the rest of the places that use int to use Eigen::Index for storing Eigen matrix sizes)

SteveBronder · 2024-02-06T21:55:36Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  for (int col = 0; col < cols; ++col) {
+    for (int row = 0; row < rows; ++row) {
+      int index
+          = col * rows + row;  // Calculating linear index in column-major order
+      valueWithIndex[index] = {draws(row, col), index};
+    }
+  }


For column-major Eigen matrices you can just iterate through each column by row using operator()(Eigen::Index). Also note we use snake_case for objects and CamelCase for template parameters.

Suggested change

-  for (int col = 0; col < cols; ++col) {
-    for (int row = 0; row < rows; ++row) {
-      int index
-          = col * rows + row;  // Calculating linear index in column-major order
-      valueWithIndex[index] = {draws(row, col), index};
-    }
-  }
+  for (Eigen::Index i = 0; i < size; ++i) {
+    value_with_index[i] = {draws(i), i};
+  }
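
The equivalence between the nested loops and a single linear loop can be sketched without Eigen. This is only an illustration under the assumption that the draws sit in a flat column-major buffer (Eigen's default layout); the names values_with_index and row_col are hypothetical, and std::vector stands in for Eigen::MatrixXd.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Pairs each value of a column-major buffer with its linear index.
// Element (row, col) lives at col * rows + row, so one loop over the
// linear index visits the same elements, in the same order, as the
// nested col/row loops.
inline std::vector<std::pair<double, std::size_t>> values_with_index(
    const std::vector<double>& column_major_draws) {
  std::vector<std::pair<double, std::size_t>> value_with_index(
      column_major_draws.size());
  for (std::size_t i = 0; i < column_major_draws.size(); ++i) {
    value_with_index[i] = {column_major_draws[i], i};
  }
  return value_with_index;
}

// Recovers {row, col} from a linear index if it is ever needed.
inline std::pair<std::size_t, std::size_t> row_col(std::size_t index,
                                                   std::size_t rows) {
  assert(rows > 0);
  return {index % rows, index / rows};
}
```

For a 3 x 2 buffer, linear index 4 corresponds to row 1, column 1, which is the same element the nested loops reach at col = 1, row = 1.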

SteveBronder · 2024-02-06T21:58:19Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+    for (int k = i; k < j; ++k) {
+      int index = valueWithIndex[k].second;
+      int row = index % rows;  // Adjusting row index for column-major order
+      int col = index / rows;  // Adjusting column index for column-major order
+      double p = (avgRank - 3.0 / 8.0) / (size - 2.0 * 3.0 / 8.0 + 1.0);
+      rankMatrix(row, col) = boost::math::quantile(dist, p);
+    }


As above, you should be able to index directly here without calculating rows / cols.

SteveBronder · 2024-02-06T21:59:12Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  int rows = draws.rows();
+  int cols = draws.cols();
+  int size = rows * cols;
+  Eigen::MatrixXd rankMatrix = Eigen::MatrixXd::Zero(rows, cols);


We like to declare things as near as possible to where they are used so I'd move this down to right before the if

SteveBronder · 2024-02-06T22:01:19Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+    int j = i;
+    double sumRanks = 0;
+    int count = 0;
+
+    while (j < size && valueWithIndex[j].first == valueWithIndex[i].first) {
+      sumRanks += j + 1;  // Rank starts from 1
+      ++j;
+      ++count;
+    }


Right now this while loop will always run at least once. Can you start j at j = i + 1 etc. to avoid this? Then the while loop only runs when there are duplicates.
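
A minimal sketch of the tie handling with the inner loop starting at i + 1 might look as follows. It works on a plain std::vector rather than an Eigen matrix, and the function name average_ranks is hypothetical, so treat it as an illustration of the idea rather than the PR's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns 1-based average ranks; tied values share the mean of their ranks.
inline std::vector<double> average_ranks(const std::vector<double>& draws) {
  const std::size_t size = draws.size();
  std::vector<std::size_t> order(size);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(), [&draws](std::size_t a, std::size_t b) {
    return draws[a] < draws[b];
  });

  std::vector<double> ranks(size);
  for (std::size_t i = 0; i < size;) {
    // Start at i + 1 so the while body only runs when there are duplicates.
    std::size_t j = i + 1;
    while (j < size && draws[order[j]] == draws[order[i]]) {
      ++j;
    }
    // Sorted positions i..j-1 hold ranks i+1..j; their mean is (i + 1 + j) / 2.
    const double avg_rank
        = (static_cast<double>(i) + 1.0 + static_cast<double>(j)) / 2.0;
    for (std::size_t k = i; k < j; ++k) {
      ranks[order[k]] = avg_rank;
    }
    i = j;
  }
  return ranks;
}
```

Because the mean of consecutive ranks has a closed form, the running sum and counter from the original loop are no longer needed.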

SteveBronder · 2024-02-06T22:02:09Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+ * Computes square root of marginal posterior variance of the estimand by
+ * weigted average of within-chain variance W and between-chain variance B.


Suggested change

- * Computes square root of marginal posterior variance of the estimand by
- * weigted average of within-chain variance W and between-chain variance B.
+ * Computes square root of marginal posterior variance of the estimand by the
+ * weighted average of within-chain variance W and between-chain variance B.

SteveBronder · 2024-02-06T22:14:03Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  for (int chain = 0; chain < num_chains; ++chain) {
+    boost::accumulators::accumulator_set<
+        double, boost::accumulators::stats<boost::accumulators::tag::mean,
+                                           boost::accumulators::tag::variance>>
+        acc_draw;
+    for (int n = 0; n < num_draws; ++n) {
+      acc_draw(draws(n, chain));
+    }
+    chain_mean(chain) = boost::accumulators::mean(acc_draw);
+    acc_chain_mean(chain_mean(chain));
+    chain_var(chain)
+        = boost::accumulators::variance(acc_draw) * unbiased_var_scale;
+  }


It doesn't look like you need an online way to update and view the current mean and variance. You can calculate the mean for each chain via draws.colwise().mean() and draws.array().mean() for the overall mean. Then a loop for the chain variance calculation.

Now that I'm reading this again, should that be named acc_chain_variance because of line 105? I've never used boost accumulators before, so I'm not sure if I'm following all the way here.
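
For illustration, the per-chain means and variances can be computed directly and combined into Rhat without accumulators. The sketch below uses std::vector in place of Eigen's colwise().mean(), assumes at least two chains of equal length with at least two draws each, and uses hypothetical names (mean_of, sample_variance, rhat_sketch). It follows the usual within-chain / between-chain formulation and is not the PR's implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

inline double mean_of(const std::vector<double>& x) {
  double sum = 0.0;
  for (double v : x) {
    sum += v;
  }
  return sum / static_cast<double>(x.size());
}

// Unbiased sample variance of x around a precomputed mean.
inline double sample_variance(const std::vector<double>& x, double mean) {
  double sum_sq = 0.0;
  for (double v : x) {
    sum_sq += (v - mean) * (v - mean);
  }
  return sum_sq / (static_cast<double>(x.size()) - 1.0);
}

// Assumes at least two chains, all with the same length n >= 2.
inline double rhat_sketch(const std::vector<std::vector<double>>& chains) {
  const double n = static_cast<double>(chains.front().size());
  std::vector<double> chain_mean;
  std::vector<double> chain_var;
  for (const auto& chain : chains) {
    chain_mean.push_back(mean_of(chain));
    chain_var.push_back(sample_variance(chain, chain_mean.back()));
  }
  const double within = mean_of(chain_var);  // W
  const double between_over_n
      = sample_variance(chain_mean, mean_of(chain_mean));  // B / n
  const double var_plus = (n - 1.0) / n * within + between_over_n;
  return std::sqrt(var_plus / within);
}
```

For the chains [1, 2, 3] and [2, 3, 4] this gives W = 1 and B / n = 0.5, so the result is sqrt(7 / 6). When every chain is constant, W is zero and the division yields infinity on IEEE floating point.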

…y of calculating variance and averages

…cale_reduction

…wrhat

SteveBronder · 2024-02-27T16:28:19Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+ *
+ */
+
+Eigen::MatrixXd rank_transform(const Eigen::MatrixXd& draws) {


For the error on Jenkins: marking the function inline makes it so that there are not multiple definitions across different translation units.

Suggested change

-Eigen::MatrixXd rank_transform(const Eigen::MatrixXd& draws) {
+inline Eigen::MatrixXd rank_transform(const Eigen::MatrixXd& draws) {

…wrhat

stan-buildbot · 2024-02-29T00:48:44Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
arma/arma.stan	0.22	0.2	1.13	11.26% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.01	0.01	0.92	-8.7% slower
gp_regr/gen_gp_data.stan	0.02	0.02	0.97	-2.6% slower
gp_regr/gp_regr.stan	0.11	0.11	0.98	-1.79% slower
sir/sir.stan	80.14	78.73	1.02	1.76% faster
irt_2pl/irt_2pl.stan	4.27	3.94	1.08	7.62% faster
eight_schools/eight_schools.stan	0.06	0.05	1.05	4.36% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.26	0.26	1.01	1.3% faster
pkpd/one_comp_mm_elim_abs.stan	18.89	18.45	1.02	2.3% faster
garch/garch.stan	0.49	0.46	1.07	6.46% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.87	2.83	1.01	1.44% faster
arK/arK.stan	1.67	1.65	1.01	1.07% faster
gp_pois_regr/gp_pois_regr.stan	2.59	2.5	1.03	3.35% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	9.41	9.21	1.02	2.2% faster
performance.compilation	176.71	180.14	0.98	-1.94% slower
Mean result: 1.0212747855344297

Jenkins Console Log
Blue Ocean
Commit hash: f70fb84db501edcbf822691c02664c43b76287b2

Machine information

No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focal

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 80
On-line CPU(s) list: 0-79
Thread(s) per core: 2
Core(s) per socket: 20
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Stepping: 4
CPU MHz: 3351.692
CPU max MHz: 3700.0000
CPU min MHz: 1000.0000
BogoMIPS: 4800.00
Virtualization: VT-x
L1d cache: 1.3 MiB
L1i cache: 1.3 MiB
L2 cache: 40 MiB
L3 cache: 55 MiB
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79
Vulnerability Gather data sampling: Mitigation; Microcode
Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Retbleed: Mitigation; IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; Clear CPU buffers; SMT vulnerable
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke md_clear flush_l1d arch_capabilities

G++:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Clang:
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

SteveBronder

I think this is good! The suggested changes mostly cover some C++ things, some Qs about the code's math, and handling some edge cases.

As a first PR this is very good so far!

SteveBronder · 2024-02-29T19:48:56Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+ */
+inline double compute_potential_scale_reduction_rank(
+    std::vector<const double*> draws, std::vector<size_t> sizes) {
+  int num_chains = sizes.size();


std::vector's size type is std::size_t

Suggested change

-  int num_chains = sizes.size();
+  std::size_t num_chains = sizes.size();

or use auto

Why is only the first size checked to see if it is zero and then we return a NaN? If one chain failed we should still be able to use information from all of the other chains. Looking at the rest of the code, unless there is a math reason to not ignore zero sized chains I think we should just prune them

  std::vector<const double*> nonzero_chain_begins;
  std::vector<std::size_t> nonzero_chain_sizes;
  for (std::size_t i = 0; i < chain_sizes.size(); ++i) {
    // keep only the chains that actually contain draws
    if (chain_sizes[i]) {
      nonzero_chain_begins.push_back(chain_begins[i]);
      nonzero_chain_sizes.push_back(chain_sizes[i]);
    }
  }
  if (!nonzero_chain_sizes.size()) {
    return std::numeric_limits<double>::quiet_NaN();
  }

SteveBronder · 2024-02-29T19:56:09Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  size_t num_draws = sizes[0];
+  if (num_draws == 0) {
+    return std::numeric_limits<double>::quiet_NaN();
+  }


Also should num_draws be min_num_draws since it's the minimum number of draws received from each chain?
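
As a small sketch of what the proposed name would convey, the minimum can be taken over all chain sizes rather than reading only the first one. The function name min_num_draws and the choice to return 0 for an empty input are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Smallest number of draws across all chains; 0 when there are no chains.
inline std::size_t min_num_draws(const std::vector<std::size_t>& chain_sizes) {
  if (chain_sizes.empty()) {
    return 0;
  }
  return *std::min_element(chain_sizes.begin(), chain_sizes.end());
}
```

A caller could then return NaN when this value is 0 and otherwise trim every chain to this length.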

SteveBronder · 2024-02-29T20:04:37Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+ * @return potential scale reduction for the specified parameter
+ */
+inline double compute_potential_scale_reduction_rank(
+    std::vector<const double*> draws, std::vector<size_t> sizes) {


I'd rename these to be a little more clear. begin() is a function in standard library containers that means "an iterator pointing to the first element of a container" so using begin in the name here will signal to people "these pointers are the pointers to the first element of each chain"

Suggested change

-    std::vector<const double*> draws, std::vector<size_t> sizes) {
+    std::vector<const double*> chain_begins, std::vector<size_t> chain_sizes) {

SteveBronder · 2024-02-29T20:15:20Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  if (are_all_const) {
+    // If all chains are constant then return NaN
+    // if they all equal the same constant value
+    if (init_draw.isApproxToConstant(init_draw(0))) {
+      return std::numeric_limits<double>::quiet_NaN();
+    }
+  }


But it is fine if each chain is constant, but each one is a different value? tbc I'm asking because idk if that is how the paper is written or not. I suppose this makes sense in the case of many short chains

Yes, you are correct. The current implementation fails if different chains are constant. For example, C1: [1, 1, 1]; C2: [2, 2, 2] would have a within-variance of 0, and the rhat function would return inf due to division by zero. I think the best way to correct this is to check if there exists a non-constant chain.

Doing something intelligent around constant chains would be a big improvement on our current NaN behavior. But I'm not sure what that is as there's not a number that makes sense as the ESS.

If all chains have the same constant value, we can't tell the difference between all chains being stuck and the variable actually being constant (e.g. the diagonal of a correlation matrix), as Stan doesn't tag the variables. In that case the diagnostics in R return NA. If the chains have different constant values, then the variable can't be a true constant, and an Rhat of Inf is fine.
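
The distinction drawn above can be sketched as a single predicate: return NaN only when every draw in every chain has the same value, and let the ordinary computation run otherwise. The name all_draws_equal is hypothetical, and the sketch assumes exact floating-point comparison is acceptable, whereas the diff in this PR uses Eigen's isApproxToConstant.

```cpp
#include <cstddef>
#include <vector>

// True when every draw in every chain equals the very first draw, i.e. all
// chains are constant and share one value. Empty input is treated as
// "all equal" so that the caller falls back to NaN.
inline bool all_draws_equal(const std::vector<std::vector<double>>& chains) {
  if (chains.empty() || chains.front().empty()) {
    return true;
  }
  const double first = chains.front().front();
  for (const auto& chain : chains) {
    for (double v : chain) {
      if (v != first) {
        return false;
      }
    }
  }
  return true;
}
```

Chains such as [1, 1, 1] and [2, 2, 2] fail this predicate, so they would proceed to the usual computation, where the zero within-chain variance yields infinity.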

SteveBronder · 2024-02-29T20:30:31Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+    }
+  }
+
+  Eigen::MatrixXd matrix(num_draws, num_chains);


Suggested change

-  Eigen::MatrixXd matrix(num_draws, num_chains);
+  Eigen::MatrixXd draws_matrix(num_draws, num_chains);

SteveBronder · 2024-02-29T21:32:50Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  double rhat_tail = rhat(rank_transform(
+      (matrix.array() - math::quantile(matrix.reshaped(), 0.5)).abs()));
+
+  return std::max(rhat_bulk, rhat_tail);


Question for @avehtari

Do we want to just return the max or should we return a pair so the user can see the bulk and tail rhats?

It is useful for the user to know both. A diagnostic message could be simplified by reporting if the max of these is too high, but otherwise I would prefer that both be available to the user. Making them both available does change the IO via CSV, and changing CSV structures needs to be considered carefully.

Okay then, @aleksgorica, can you have this return a std::pair?
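
A minimal sketch of the std::pair interface discussed here, with hypothetical names (make_rhat_pair, max_rhat): the function hands back both values, and a caller that only wants a single number takes the maximum itself.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical wrapper: bundles the two diagnostics instead of collapsing
// them. first = bulk Rhat, second = tail Rhat.
inline std::pair<double, double> make_rhat_pair(double rhat_bulk,
                                                double rhat_tail) {
  return {rhat_bulk, rhat_tail};
}

// A caller that only needs one number can still reduce the pair.
inline double max_rhat(const std::pair<double, double>& rhats) {
  return std::max(rhats.first, rhats.second);
}
```

Keeping the reduction on the caller's side means a diagnostic message can use the maximum while output code can still report both values.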

SteveBronder · 2024-02-29T21:39:58Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+inline double compute_split_potential_scale_reduction_rank(
+    std::vector<const double*> draws, std::vector<size_t> sizes) {


We want these arguments to come in as constant references. As written this will make a hard copy of the input vectors when you call this function. Making the arguments references (&) means the function will just use the already existing object without making a copy and const means the arguments will be constant in the function (i.e. we will not modify them)

Suggested change

-inline double compute_split_potential_scale_reduction_rank(
-    std::vector<const double*> draws, std::vector<size_t> sizes) {
+inline double compute_split_potential_scale_reduction_rank(
+    const std::vector<const double*>& draws, const std::vector<size_t>& sizes) {

We want containers (like std::vector or Eigen::MatrixXd or std::map) to be passed by constant reference. Small types such as double, int, std::size_t etc. can be passed by value as making copies of them is trivial. Happy to explain this more if you like but don't want to overload you with info. A nice place to read about things like this is Scott Meyers "Effective Modern C++" which if you google should be easy to find a free copy of online.

This comment applies to all the function signatures you added here
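
The copy that pass-by-value makes can be observed by comparing buffer addresses. The two function names below are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// By value: the argument is a copy, so it owns a different buffer.
inline bool shares_buffer_by_value(std::vector<std::size_t> sizes,
                                   const std::size_t* caller_data) {
  return sizes.data() == caller_data;
}

// By const reference: the function sees the caller's object, no copy is made.
inline bool shares_buffer_by_const_ref(const std::vector<std::size_t>& sizes,
                                       const std::size_t* caller_data) {
  return sizes.data() == caller_data;
}
```

Calling both with the same non-empty vector and its data() pointer returns false for the by-value version and true for the const-reference version.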

SteveBronder · 2024-02-29T21:48:14Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+ * Current implementation assumes draws are stored in contiguous
+ * blocks of memory.  Chains are trimmed from the back to match the
+ * length of the shortest chain.


You use split_chains which also assumes each chain is the same length

SteveBronder · 2024-02-29T21:54:34Z

src/stan/analyze/mcmc/compute_potential_scale_reduction.hpp

+  double half = num_draws / 2.0;
+  std::vector<size_t> half_sizes(2 * num_chains, std::floor(half));


Just to make it clearer that you are using floating-point division and then taking the floor to get the index

Suggested change

-  double half = num_draws / 2.0;
-  std::vector<size_t> half_sizes(2 * num_chains, std::floor(half));
+  std::size_t half = std::floor(num_draws / 2.0);
+  std::vector<size_t> half_sizes(2 * num_chains, half);
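
As an alternative sketch (not what the PR does), unsigned integer division already gives the floor, so the half length can be computed without going through floating point. The name split_layout is hypothetical, and the sketch assumes that for an odd number of draws the middle draw is dropped, so both halves have the same length.

```cpp
#include <cstddef>
#include <utility>

// Returns {length of each half, offset where the second half starts}.
// Assumption: with an odd number of draws the middle draw is skipped, so the
// second half starts at ceil(n / 2) and both halves hold floor(n / 2) draws.
inline std::pair<std::size_t, std::size_t> split_layout(std::size_t num_draws) {
  const std::size_t half = num_draws / 2;             // floor(n / 2)
  const std::size_t second_begin = num_draws - half;  // ceil(n / 2)
  return {half, second_begin};
}
```

For 7 draws this yields halves of length 3 with the second half starting at offset 4; for 8 draws both values are 4.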

stan-buildbot · 2024-03-25T16:13:20Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
arma/arma.stan	0.25	0.23	1.1	9.39% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.01	0.02	0.62	-60.65% slower
gp_regr/gen_gp_data.stan	0.02	0.02	0.98	-1.8% slower
gp_regr/gp_regr.stan	0.12	0.13	0.93	-7.72% slower
sir/sir.stan	87.69	85.71	1.02	2.25% faster
irt_2pl/irt_2pl.stan	4.56	4.42	1.03	3.16% faster
eight_schools/eight_schools.stan	0.06	0.07	0.83	-20.07% slower
pkpd/sim_one_comp_mm_elim_abs.stan	0.28	0.26	1.08	7.16% faster
pkpd/one_comp_mm_elim_abs.stan	20.41	20.05	1.02	1.78% faster
garch/garch.stan	0.59	0.51	1.15	13.25% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.38	2.99	1.13	11.56% faster
arK/arK.stan	1.83	1.75	1.05	4.45% faster
gp_pois_regr/gp_pois_regr.stan	2.82	2.66	1.06	5.78% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	10.38	10.44	0.99	-0.55% slower
performance.compilation	209.61	208.84	1.0	0.37% faster
Mean result: 1.0006665985989671

Jenkins Console Log
Blue Ocean
Commit hash: 728ec0a53bd4cce07bf159a426e13512de0b0263

stan-buildbot · 2024-04-07T15:22:14Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
arma/arma.stan	0.2	0.24	0.86	-16.9% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.01	0.01	1.09	8.21% faster
gp_regr/gen_gp_data.stan	0.02	0.02	1.08	7.31% faster
gp_regr/gp_regr.stan	0.11	0.1	1.06	5.56% faster
sir/sir.stan	77.99	75.3	1.04	3.45% faster
irt_2pl/irt_2pl.stan	3.85	3.74	1.03	2.7% faster
eight_schools/eight_schools.stan	0.05	0.05	1.03	2.73% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.25	0.24	1.01	1.24% faster
pkpd/one_comp_mm_elim_abs.stan	18.27	17.61	1.04	3.62% faster
garch/garch.stan	0.46	0.44	1.04	3.64% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.79	2.75	1.02	1.63% faster
arK/arK.stan	1.63	1.59	1.02	2.16% faster
gp_pois_regr/gp_pois_regr.stan	2.54	2.45	1.04	3.65% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	9.07	8.92	1.02	1.65% faster
performance.compilation	179.41	180.07	1.0	-0.37% slower
Mean result: 1.0234405050935638

Jenkins Console Log
Blue Ocean
Commit hash: 0fcf10855f923eb24c8e9958f1f19fde97572810

SteveBronder · 2024-04-09T16:01:51Z

@aleksgorica this code all looks good! The only thing to fix now is the API. We want to keep all the old code and functions the same for backwards compatibility and have all of this in new functions. For instance compute_potential_scale_reduction() should have the same results as previously.

So the new code you have in compute_potential_scale_reduction_rank etc. is good and you can revert your changes to compute_potential_scale_reduction. New code in the future can use your compute_potential_scale_reduction_rank for the new estimate.

Once that is changed then I think you are good to merge!

aleksgorica · 2024-04-17T14:34:14Z

Okay, I hope I have understood correctly, I have just reverted the changes in the original non-rank functions to the previous code. However, I would like to know why that is necessary since I believe the new code yields the same results as the previous code according to the tests, and it is also a bit better written.

stan-buildbot · 2024-04-17T17:00:55Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
arma/arma.stan	0.26	0.19	1.35	25.96% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.01	0.01	1.08	7.44% faster
gp_regr/gen_gp_data.stan	0.02	0.02	1.04	4.04% faster
gp_regr/gp_regr.stan	0.11	0.1	1.04	3.42% faster
sir/sir.stan	78.12	75.68	1.03	3.12% faster
irt_2pl/irt_2pl.stan	3.87	3.93	0.99	-1.51% slower
eight_schools/eight_schools.stan	0.05	0.05	1.0	0.17% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.25	0.25	0.99	-0.58% slower
pkpd/one_comp_mm_elim_abs.stan	18.03	18.27	0.99	-1.28% slower
garch/garch.stan	0.45	0.46	0.98	-2.51% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.78	2.83	0.98	-1.99% slower
arK/arK.stan	1.64	1.69	0.97	-3.19% slower
gp_pois_regr/gp_pois_regr.stan	2.5	2.61	0.96	-4.57% slower
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	9.11	9.43	0.97	-3.52% slower
performance.compilation	178.35	179.15	1.0	-0.45% slower
Mean result: 1.0234807232212786

Jenkins Console Log
Blue Ocean
Commit hash: ffae22a76851e3b49ca43f31234eeada692f1d9b

bob-carpenter · 2024-04-17T17:59:08Z

> Okay, I hope I have understood correctly, I have just reverted the changes in the original non-rank functions to the previous code. However, I would like to know why that is necessary since I believe the new code yields the same results.

There's no need to keep the old code, but we need to keep the old interfaces so we don't break anyone's existing code. The new code should be doing ranked R-hat, right? That won't provide the same answers. If you wrote general enough code to do both, you can have the old interface delegate to the new code. Then we can rewrite the interfaces to use the new code and get rid of the old code. @SteveBronder will know more about the specifics here.

SteveBronder · 2024-04-19T16:45:16Z

@bob-carpenter I rewrote this to dispatch to the old code from the previous API

@aleksgorica there's one signature missing for split_potential_scale_reduction_rank so it matches the API of split_potential_scale_reduction. See below. Can you add that signature and have the current split_potential_scale_reduction signature call your new split_potential_scale_reduction_rank returning bulk rhat?

https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/chains.hpp#L219

stan-buildbot · 2024-04-19T20:09:51Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
arma/arma.stan	0.22	0.2	1.09	8.35% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.01	0.01	0.95	-5.09% slower
gp_regr/gen_gp_data.stan	0.02	0.02	0.99	-0.75% slower
gp_regr/gp_regr.stan	0.12	0.11	1.06	5.76% faster
sir/sir.stan	81.19	82.66	0.98	-1.81% slower
irt_2pl/irt_2pl.stan	4.15	4.03	1.03	2.92% faster
eight_schools/eight_schools.stan	0.05	0.05	1.01	0.63% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.25	0.25	1.01	0.65% faster
pkpd/one_comp_mm_elim_abs.stan	18.61	18.33	1.01	1.47% faster
garch/garch.stan	0.48	0.47	1.03	2.79% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.91	2.83	1.03	2.6% faster
arK/arK.stan	1.66	1.67	0.99	-0.59% slower
gp_pois_regr/gp_pois_regr.stan	2.6	2.61	1.0	-0.35% slower
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	9.39	9.48	0.99	-0.94% slower
performance.compilation	190.84	190.77	1.0	0.04% faster
Mean result: 1.011567169879977

Jenkins Console Log
Blue Ocean
Commit hash: d8ac1c67d08c8b579e1b20ed5747abacce87c6fc

SteveBronder · 2024-04-22T15:07:36Z

Actually I reverted my old PR. Changing all the old functions to use the new one causes cmdstan tests to fail.

Let's have the _rank versions like you made here. We just need one last signature for

double split_potential_scale_reduction(
      const Eigen::Matrix<Eigen::VectorXd, Dynamic, 1>& samples)

Then we are good! After this is in we can then change cmdstan etc. to use the new ranked version. I think this is better because we need to change the output anyway to report back both the bulk and tail rhat
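To make "report back both the bulk and tail rhat" concrete, here is a minimal sketch of a function that returns the two values as a pair. It is not the code in this PR: every name is hypothetical, it uses plain std::vector instead of Eigen, it assumes the chains have already been split, are of equal length, and contain at least two draws each, and it assumes the tail variant is the rank-normalized rhat of draws folded around the pooled median. The fractional offset 3/8 in the rank-to-score mapping is likewise an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// All names below are hypothetical; this is not the Stan implementation.
using Chains = std::vector<std::vector<double>>;  // one inner vector per chain

// Standard normal quantile via bisection on the CDF (simple, not fast).
inline double inv_normal_cdf_sketch(double p) {
  double lo = -10.0, hi = 10.0;
  for (int iter = 0; iter < 100; ++iter) {
    const double mid = 0.5 * (lo + hi);
    const double cdf = 0.5 * std::erfc(-mid / std::sqrt(2.0));
    if (cdf < p) lo = mid; else hi = mid;
  }
  return 0.5 * (lo + hi);
}

// Pool all draws, give tied values their average rank, map ranks to z-scores.
inline Chains rank_normalize_sketch(const Chains& chains) {
  struct Entry { double value; std::size_t chain; std::size_t draw; };
  std::vector<Entry> pooled;
  for (std::size_t c = 0; c < chains.size(); ++c)
    for (std::size_t i = 0; i < chains[c].size(); ++i)
      pooled.push_back({chains[c][i], c, i});
  std::sort(pooled.begin(), pooled.end(),
            [](const Entry& a, const Entry& b) { return a.value < b.value; });
  const double total = static_cast<double>(pooled.size());
  Chains scores = chains;
  for (std::size_t i = 0; i < pooled.size();) {
    std::size_t j = i;
    while (j < pooled.size() && pooled[j].value == pooled[i].value) ++j;
    // Ranks i+1 .. j are tied; their average is (i + 1 + j) / 2.
    const double avg_rank = 0.5 * static_cast<double>(i + 1 + j);
    // Assumed fractional offset: p = (r - 3/8) / (S + 1/4).
    const double z = inv_normal_cdf_sketch((avg_rank - 0.375) / (total + 0.25));
    for (std::size_t k = i; k < j; ++k)
      scores[pooled[k].chain][pooled[k].draw] = z;
    i = j;
  }
  return scores;
}

// Classic rhat: sqrt(((n - 1) / n * W + B / n) / W).
inline double rhat_sketch(const Chains& chains) {
  const double m = static_cast<double>(chains.size());
  const double n = static_cast<double>(chains[0].size());
  std::vector<double> means, vars;
  for (const auto& chain : chains) {
    const double mean = std::accumulate(chain.begin(), chain.end(), 0.0) / n;
    double ss = 0.0;
    for (double x : chain) ss += (x - mean) * (x - mean);
    means.push_back(mean);
    vars.push_back(ss / (n - 1.0));
  }
  const double grand = std::accumulate(means.begin(), means.end(), 0.0) / m;
  double b_over_n = 0.0;  // between-chain variance divided by n
  for (double mu : means) b_over_n += (mu - grand) * (mu - grand);
  b_over_n /= (m - 1.0);
  const double w = std::accumulate(vars.begin(), vars.end(), 0.0) / m;
  return std::sqrt(((n - 1.0) / n * w + b_over_n) / w);
}

// Returns {bulk rhat, tail rhat}; tail uses draws folded around the median.
inline std::pair<double, double> bulk_tail_rhat_sketch(const Chains& chains) {
  std::vector<double> all;
  for (const auto& chain : chains) all.insert(all.end(), chain.begin(), chain.end());
  std::sort(all.begin(), all.end());
  const std::size_t n = all.size();
  const double median =
      (n % 2 == 1) ? all[n / 2] : 0.5 * (all[n / 2 - 1] + all[n / 2]);
  Chains folded = chains;
  for (auto& chain : folded)
    for (double& x : chain) x = std::abs(x - median);
  return {rhat_sketch(rank_normalize_sketch(chains)),
          rhat_sketch(rank_normalize_sketch(folded))};
}
```

A caller that only wants a single number, as the summary of this PR describes, could take std::max of the two members; returning the pair leaves that choice to the interface that prints the diagnostics.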

stan-buildbot · 2024-04-22T19:42:10Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
arma/arma.stan	0.2	0.19	1.07	6.13% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.01	0.01	1.12	10.43% faster
gp_regr/gen_gp_data.stan	0.02	0.02	1.1	8.7% faster
gp_regr/gp_regr.stan	0.11	0.1	1.06	5.37% faster
sir/sir.stan	77.58	75.31	1.03	2.92% faster
irt_2pl/irt_2pl.stan	3.76	3.75	1.0	0.26% faster
eight_schools/eight_schools.stan	0.05	0.05	1.02	1.8% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.25	0.24	1.04	3.67% faster
pkpd/one_comp_mm_elim_abs.stan	17.97	17.57	1.02	2.27% faster
garch/garch.stan	0.45	0.45	1.0	-0.12% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.77	2.73	1.02	1.55% faster
arK/arK.stan	1.63	1.6	1.02	2.0% faster
gp_pois_regr/gp_pois_regr.stan	2.48	2.5	0.99	-0.79% slower
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	9.05	9.02	1.0	0.4% faster
performance.compilation	173.78	176.46	0.98	-1.54% slower
Mean result: 1.030804841038834

Jenkins Console Log
Blue Ocean
Commit hash: 934e17704b703a32c4c29f9ab5ae1913c4a58a57

aleksgorica · 2024-04-22T20:38:43Z

> Actually I reverted my old PR. Changing all the old functions to use the new one causes cmdstan tests to fail.
>
> Let's have the _rank versions like you made here. We just need one last signature for
>
> double split_potential_scale_reduction(
>       const Eigen::Matrix<Eigen::VectorXd, Dynamic, 1>& samples)
>
> Then we are good! After this is in we can then change cmdstan etc. to use the new ranked version. I think this is better because we need to change the output anyway to report back both the bulk and tail rhat

OK, but why should we even keep double split_potential_scale_reduction( const Eigen::Matrix<Eigen::VectorXd, Dynamic, 1>& samples), or add double split_potential_scale_reduction_rank( const Eigen::Matrix<Eigen::VectorXd, Dynamic, 1>& samples), when the function is declared private, has no tests, and has no known references? Would removing it break backward compatibility?

SteveBronder · 2024-04-22T21:13:38Z

Oh, you're right, I didn't see that it was a private member that is never called. Let's leave it for now, but I don't think you need to write a rank version for that one.

SteveBronder · 2024-04-22T21:19:50Z

@aleksgorica you should have received an email that adds you to the Stan project :) Clicking on the link in that email from GitHub should then allow you to press the "merge pull request" button.

aleksgorica · 2024-04-22T22:10:14Z

Thank you all for accepting me into the Stan project :). I had a really great time working on the pull request. Thank you also to @SteveBronder and @bob-carpenter for mentoring.

But GitHub doesn't allow me to merge the pull request.

bob-carpenter · 2024-04-23T15:07:48Z

I just verified that Steve gave you permission to merge. Can you verify that works on your end by merging this?

Thanks, and welcome to the Stan team!

SteveBronder · 2024-04-23T15:31:35Z

Did you click on the link in the email that was sent to you from github? After you click that and refresh the page you should have permission to merge

aleksgorica · 2024-04-23T15:57:31Z

Yes, I clicked multiple times; now it always redirects me to https://github.com/stan-dev/stan. But it still does not allow me to merge. If I go to https://github.com/settings/organizations, it says that I am an outside collaborator on 1 repository for Stan.

SteveBronder · 2024-04-23T16:10:09Z

Sorry, you should have just gotten a new email that should give you the right permissions.

aleksgorica and others added 7 commits January 30, 2024 11:06

newrhat

88173cc

resolved some of pr1

98450d7

comments deleted

b09f512

Test changed

dcd21f2

chains_test modified

c20ce61

Merge commit 'a0154340ce1f195de01839075adad25e10bf28d5' into HEAD

0e8379f

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

5630c51

SteveBronder reviewed Feb 6, 2024

aleksgorica and others added 10 commits February 11, 2024 14:22

Eigen::Index; index without calculating rows, cols; removed online way of calculating variance and averages

50617bd

Merge commit 'b6d010fa1dab84d8910b382636c8707c4104fd2e' into HEAD

7c5880f

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

dc130f8

duplicated functions and test for rank version of compute_potential_scale_reduction

3433d9c

Merge branch 'newrhat' of https://github.com/aleksgorica/stan into newrhat

4b58653

Merge commit '41fd137d7cf89db794210142d6c61f07cf9f0b0a' into HEAD

3aeab3f

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

30333ce

test chains_test changed

77d875f

Merge branch 'newrhat' of https://github.com/aleksgorica/stan into newrhat

6221540

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

da10f8f

SteveBronder reviewed Feb 27, 2024

aleksgorica and others added 4 commits February 28, 2024 19:10

added test values from arviz

c3b1101

Merge branch 'newrhat' of https://github.com/aleksgorica/stan into newrhat

263ac22

Merge commit '348716b22e624b98000a6ee4a4389603da861493' into HEAD

b6ef4bb

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

51ad447

SteveBronder requested changes Feb 29, 2024

aleksgorica and others added 3 commits March 25, 2024 14:35

smaller changes for comments in pull request

71bfbcf

Merge commit '951ce92cee114881c9baa556bb63cebffb9f7772' into HEAD

c371c21

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

385c80b

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

971aed7

aleksgorica and others added 3 commits April 17, 2024 16:26

reverting nonrank functions

61c4c6c

Merge commit 'c582de8f5a59e721df0c2786830a8c8b921e2961' into HEAD

9dae7a0

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

51db135

SteveBronder and others added 3 commits April 19, 2024 12:40

update so scale_reduction calls scale_reduction_rank

f12f259

Merge commit '634034deb3abd6314d980c1aab083f64269f4019' into HEAD

3cb8e97

[Jenkins] auto-formatting by clang-format version 10.0.0-4ubuntu1

da5bb8d

Revert "update so scale_reduction calls scale_reduction_rank"

b3631be

This reverts commit f12f259.

SteveBronder approved these changes Apr 22, 2024

aleksgorica merged commit f93a559 into stan-dev:develop Apr 23, 2024
3 checks passed

mitzimorris mentioned this pull request Apr 23, 2024

Update stansummary to use new rhat diagnostic stan-dev/cmdstan#1263

Open

avehtari mentioned this pull request May 30, 2024

cmdstan docs for cmdstan_diagnose describe incorrect version of R-hat stan-dev/docs#600

Open

mitzimorris mentioned this pull request Jul 12, 2024

use improved Rhat to implement ESS-bulk and ESS-tail #3299

Open

		* Computes square root of marginal posterior variance of the estimand by
		* weighted average of within-chain variance W and between-chain variance B.

	Eigen::MatrixXd rank_transform(const Eigen::MatrixXd& draws) {
	inline Eigen::MatrixXd rank_transform(const Eigen::MatrixXd& draws) {

	int num_chains = sizes.size();
	std::size_t num_chains = sizes.size();

	std::vector<const double*> draws, std::vector<size_t> sizes) {
	std::vector<const double*> chain_begins, std::vector<size_t> chain_sizes) {

	Eigen::MatrixXd matrix(num_draws, num_chains);
	Eigen::MatrixXd draws_matrix(num_draws, num_chains);

		inline double compute_split_potential_scale_reduction_rank(
		std::vector<const double*> draws, std::vector<size_t> sizes) {

		double half = num_draws / 2.0;
		std::vector<size_t> half_sizes(2 * num_chains, std::floor(half));
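The half_sizes line above sizes every split chain to floor(num_draws / 2). A minimal sketch of that splitting step follows, using owning std::vector copies rather than the pointer-and-size pairs in the reviewed code. The function name is hypothetical, and two choices are assumptions rather than something this page confirms: the middle draw is dropped when a chain has an odd number of draws, and each chain is split by its own length instead of a shared num_draws.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the Stan implementation.
// Splits each chain into a first and a second half. For an odd number of
// draws the middle draw is dropped (assumption), so both halves contain
// floor(n / 2) draws.
inline std::vector<std::vector<double>> split_chains_sketch(
    const std::vector<std::vector<double>>& chains) {
  std::vector<std::vector<double>> halves;
  halves.reserve(2 * chains.size());
  for (const auto& chain : chains) {
    const double half = chain.size() / 2.0;
    const std::size_t len = static_cast<std::size_t>(std::floor(half));
    const std::size_t second_begin = static_cast<std::size_t>(std::ceil(half));
    // First half: draws [0, len).
    halves.emplace_back(chain.begin(), chain.begin() + len);
    // Second half: draws [ceil(n / 2), ceil(n / 2) + len).
    halves.emplace_back(chain.begin() + second_begin,
                        chain.begin() + second_begin + len);
  }
  return halves;
}
```

With a single chain of five draws, this yields two halves of two draws each, and the third draw is not used. Treating the halves as separate chains is what lets rhat detect a chain whose first and second halves disagree.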
