  
  [1X3 [33X[0;0YDescription of the algorithm[133X[101X
  
  
  [1X3.1 [33X[0;0YElementary version[133X[101X
  
  
  [1X3.1-1 [33X[0;0YWhat it does?[133X[101X
  
  [33X[0;0YIn the simplest possible terms, we are given a pair of matrices [23XP[123X and [23XQ[123X with
  orthogonal  rows,  [23XPQ^T=0[123X.  The  matrices  have  entries  in  a finite field
  [23XF=\mathop{\rm GF}(q)[123X, where [23Xq[123X is a power of a prime. The goal is to find the
  smallest weight of a non-zero vector [23Xc[123X over the same field [23XF[123X, such that [23Xc[123X be
  orthogonal  with  the  rows  of [23XP[123X, [23XPc^T=0[123X, and linearly independent from the
  rows of [23XQ[123X.[133X
  
  
  [1X3.1-2 [33X[0;0YThe algorithm[133X[101X
  
  [33X[0;0YWe  first  construct  a  generator  matrix  [23XG[123X whose rows form a basis of the
  [23XF[123X-linear  space  of all vectors orthogonal to the rows of [23XP[123X. At each step, a
  random  permutation  [23XS[123X  is  generated and applied to the columns of [23XG[123X. Then,
  Gauss'  elimination  with  back substitution renders the resulting matrix to
  the  reduced row echelon form, after which the inverse permutation [23XS^{-1}[123X is
  applied  to  the columns. Rows of the resulting matrix [23XG_S[123X that are linearly
  independent  from the rows of [23XQ[123X are considered as candidates for the minimum
  weight  vectors.  Thus,  after [23XN[123X steps, we are getting an upper bound on the
  distance which is improving with increasing [23XN[123X.[133X
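
  [33X[0;0YAs an illustration only, the steps above can be sketched in Python for the
  binary case [23Xq=2[123X. This is not the code of the package: the helper names
  [10Xrref[110X, [10Xnullspace[110X, [10Xrank[110X and [10Xdist_upper_bound[110X are hypothetical, matrices are
  assumed to be plain lists of 0/1 rows, and searching for pivots in a random
  column order stands in for permuting the columns, reducing, and permuting
  back.[133X

```python
import random

def rref(rows, order):
    """Reduced row echelon form over GF(2); pivots are searched in column order `order`."""
    rows = [list(r) for r in rows]
    pivots = []
    for col in order:
        r = len(pivots)  # index of the next pivot row
        hit = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if hit is None:
            continue
        rows[r], rows[hit] = rows[hit], rows[r]
        for i in range(len(rows)):  # eliminate above and below the pivot
            if i != r and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
    return rows[:len(pivots)], pivots

def rank(rows, n):
    return len(rref(rows, range(n))[1])

def nullspace(P, n):
    """Basis G of all binary vectors c with P c^T = 0."""
    R, piv = rref(P, range(n))
    basis = []
    for f in range(n):
        if f in piv:
            continue
        v = [0] * n
        v[f] = 1  # one free position set to 1
        for row, p in zip(R, piv):
            v[p] = row[f]  # pivot positions are fixed by the reduced rows
        basis.append(v)
    return basis

def dist_upper_bound(P, Q, n, steps, seed=0):
    """Upper bound on the minimum weight after `steps` random column orders."""
    rng = random.Random(seed)
    G = nullspace(P, n)
    rank_Q = rank(Q, n)
    best = None
    for _ in range(steps):
        order = list(range(n))
        rng.shuffle(order)  # random permutation S
        G_S, _ = rref(G, order)
        for row in G_S:
            w = sum(row)
            if best is not None and w >= best:
                continue
            if rank(Q + [row], n) > rank_Q:  # independent from the rows of Q
                best = w
    return best

# Example: P = Q = parity-check matrix of the [7,4,3] Hamming code (Steane code).
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
print(dist_upper_bound(H, H, 7, steps=20))  # prints 3
```

  [33X[0;0YThe sketch recomputes a rank for every candidate row, which is simple but
  wasteful; it is meant to show the logic of the steps, not an efficient
  implementation.[133X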
  
  
  [1X3.1-3 [33X[0;0YIntuition[133X[101X
  
  [33X[0;0YThe  intuition  is  that  each  row of [23XG_S[123X is guaranteed to contain at least
  [10Xrank[110X[23X(G_S)-1[123X  zeros.  Thus,  we are sampling mostly lower-weight vectors from
  the  linear  space  orthogonal  to the rows of [23XP[123X. Further, it is easy to see
  that any vector obtained this way is [13Xirreducible[113X [DKP15], i.e., it cannot be
  decomposed  into  a  pair  of  zero-syndrome  vectors  with  non-overlapping
  supports.[133X
  
  [33X[0;0YFurthermore,  the  eventual  convergence  is  guaranteed.  Indeed, if [23Xc[123X is a
  minimum-weight  codeword  of weight [23Xd[123X, consider a permutation [23XS[123X which places
  one  position  from  its  support  into  the  1st  column, and the remaining
  positions  into  the  last  [23Xd-1[123X  columns.  Vector  [23Xc[123X being the lowest-weight
  non-trivial vector, no pivot column may be in the block of last [23Xd-1[123X columns.
  This  guarantees  that  vector  [23Xc[123X is obtained as the first row of [23XG_S[123X. (This
  argument is adapted to degenerate quantum codes from [CGLN21]).[133X
  
  
  [1X3.1-4 [33X[0;0YCSS version of the algorithm[133X[101X
  
  [33X[0;0YThe  described  version  of  the  algorithm  is  implemented in the function
  [10XDistRandCSS[110X  ([14X4.1[114X).  It  applies to the case of Calderbank-Shor-Steane (CSS)
  codes,  where  the  matrices  [23XP=H_X[123X  and  [23XQ=H_Z[123X are called the CSS generator
  matrices,  and  the computed minimum weight is the distance [23Xd_Z[123X of the code.
  The  number  of  columns [23Xn[123X is the block length of the code, and it encodes [23Xk[123X
  qudits,  where [23Xk=n-[123X[10Xrank[110X[23X(H_X)-[123X[10Xrank[110X[23X(H_Z)[123X. To completely characterize the code,
  we  also  need  the  distance  [23Xd_X[123X which can be obtained by calling the same
  function  with the two matrices interchanged. The conventional code distance
  [23Xd[123X  is  the  minimum  of [23Xd_X[123X and [23Xd_Z[123X. Parameters of such a [23Xq[123X-ary CSS code are
  commonly  denoted  as  [23X[[n,k,(d_X,d_Z)]]_q[123X,  or  simply [23X[[n,k,d]]_q[123X as for a
  general [23Xq[123X-ary stabilizer code.[133X
  
  
  [1X3.1-5 [33X[0;0YGeneric version of the algorithm[133X[101X
  
  [33X[0;0YCSS  codes  are  a  subclass  of general [23XF[123X-linear stabilizer codes which are
  specified  by  a single stabilizer generator matrix [23XH=(A|B)[123X written in terms
  of  two  blocks of [23Xn[123X columns each. The orthogonality condition is given in a
  symplectic form,[133X
  
  
  [24X[33X[0;6YA B^T-B A^T=0,[133X
  
  [124X
  
  [33X[0;0Yor,   equivalently,   as  orthogonality  between  the  rows  of  [23XH[123X  and  the
  symplectic-dual matrix [23X\tilde H=(B|-A)[123X. Non-trivial vectors in the code must
  be  orthogonal  to  the rows of [23XP=\tilde H[123X and linearly independent from the
  rows of [23XQ=H[123X. The difference with the CSS version of the algorithm is that we
  must  minimize  the  [13Xsymplectic[113X  weight  of  [23Xc=(a|b)[123X, given by the number of
  positions [23Xi[123X, [23X1\le i\le n[123X, such that either [23Xa_i[123X or [23Xb_i[123X (or both) be non-zero.[133X
  
  [33X[0;0YThe parameters of such a code are denoted as [23X[[n,k,d]]_q[123X, where [23Xk=n-[123X[10Xrank[110X[23XH[123X is
  the  number  of  encoded qudits, and [23Xd[123X is the minimal symplectic weight of a
  non-trivial vector in the code. It is easy to check that a CSS code can also
  be represented in terms of a single stabilizer generator matrix. Namely, for
  a  CSS code with generators [23XH_X[123X and [23XH_Z[123X, the stabilizer generator matrix has
  a block-diagonal form, [23XH=[123X[10Xdiag[110X[23X(H_X,H_Z)[123X.[133X
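
  [33X[0;0YThe definitions above can be made concrete with a short Python sketch. It
  assumes a prime field, so that entries are integers modulo [23Xq[123X, and the
  function names [10Xsymplectic_weight[110X, [10Xsymplectic_dual[110X and [10Xcss_to_stabilizer[110X
  are hypothetical helpers introduced only for this illustration.[133X

```python
def symplectic_weight(c, n):
    """Number of positions i where a_i or b_i is non-zero, for c = (a|b)."""
    a, b = c[:n], c[n:]
    return sum(1 for ai, bi in zip(a, b) if ai != 0 or bi != 0)

def symplectic_dual(H, n, q):
    """Map each row (a|b) of H to (b|-a), with entries taken modulo the prime q."""
    return [list(r[n:]) + [(-x) % q for x in r[:n]] for r in H]

def css_to_stabilizer(H_X, H_Z, n):
    """Block-diagonal stabilizer generator matrix H = diag(H_X, H_Z)."""
    zeros = [0] * n
    return [list(r) + zeros for r in H_X] + [zeros + list(r) for r in H_Z]

# Example: (a|b) = (1,0,0 | 1,1,0) is supported on positions 1 and 2.
print(symplectic_weight([1, 0, 0, 1, 1, 0], 3))  # prints 2
```

  [33X[0;0YWith these helpers, the symplectic orthogonality condition amounts to every
  row of [10Xsymplectic_dual(H, n, q)[110X having zero inner product modulo [23Xq[123X with
  every row of [23XH[123X.[133X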
  
  [33X[0;0YA  version  of  the  algorithm  for  general  [23XF[123X-linear  stabilizer  codes is
  implemented in the function [10XDistRandStab[110X ([14X4.1[114X).[133X
  
  [33X[0;0Y[13XImportant  Notice[113X:  In general, here one could use most general permutations
  of  [23X2n[123X columns, or restricted permutations of [23Xn[123X two-column blocks preserving
  the  pair  structure  of  the  matrix. While the latter method would be much
  faster, there is no guarantee that every vector would be found. As a result,
  we decided to use general permutations of [23X2n[123X columns.[133X
  
  
  [1X3.2 [33X[0;0YSome more details[133X[101X
  
  
  [1X3.2-1 [33X[0;0YQuantum stabilizer codes[133X[101X
  
  [33X[0;0YRepresentation  of  quantum  codes  in  terms  of  linear  spaces  is just a
  convenient map. In the case [23Xq=2[123X (qubits), the details can be found, e.g., in
  the  book  of  Nielsen  and Chuang, [NC00]. Further details on the theory of
  stabilizer  quantum  error  correcting codes based on qubits can be found in
  the  Caltech  Ph.D. thesis of Daniel Gottesman [Got97] and in the definitive
  1997  paper  by  Calderbank,  Rains,  Shor,  and  Sloane [CRSS98]. Theory of
  stabilizer  quantum  codes  based  on  qudits  ([23Xq[123X-state quantum systems) was
  developed  by  Ashikhmin and Knill [AK01] (prime fields with [23Xq[123X prime) and by
  Ketkar,  Klappenecker, Kumar, & Sarvepalli [KKKS06] (extension fields with [23Xq[123X
  a non-trivial power of a prime).[133X
  
  [33X[0;0YIn  the  binary  case  (more  generally,  when [23Xq[123X is a prime), [23XF[123X-linear codes
  coincide  with  [13Xadditive[113X  codes.  The  [13Xlinear[113X  codes [e.g., over [23X\mathop{\rm
  GF}(4)[123X  in  the  binary  case  [CRSS98]]  is  a different construction which
  assumes  an  additional  symmetry. A brief summary of [23XF[123X-linear quantum codes
  [where  [23XF=\mathop{\rm GF}(q)[123X with [23Xq=p^m[123X, [23Xm>1[123X a non-trivial power of a prime]
  can  be  found  in  the  introduction  of  Ref.  [ZP20]. The construction is
  equivalent  to  a  more  physical  approach in terms of a lifted Pauli group
  suggested by Gottesman [Got14].[133X
  
  
  [1X3.2-2 [33X[0;0YThe algorithm[133X[101X
  
  [33X[0;0YCase of [13Xclassical linear codes[113X[133X
  
  [33X[0;0YThe  algorithm  [14X3.1-2[114X  is  closely  related  to  the  algorithm  for finding
  minimum-weight  codewords  in  a  classical linear code as presented by Leon
  [Leo88],  and  a  related family of [13Xinformation set[113X (IS) decoding algorithms
  [Kru89] [CG90].[133X
  
  [33X[0;0YConsider  a classical linear [23Xq[123X-ary code [23X[n,k,d]_q[123X encoding [23Xk[123X symbols into [23Xn[123X,
  specified  by  a  generator  matrix  [23XG[123X of rank [23Xk[123X. Using Gauss' algorithm and
  column  permutations, the generator matrix can be rendered into a [13Xsystematic
  form[113X, [23XG=(I|A)[123X, where the two blocks are [23XI[123X, the size-[23Xk[123X identity matrix, and a
  [23Xk[123X  by  [23Xn-k[123X  matrix  [23XA[123X.  In  such a representation, the first [23Xk[123X positions are
  called  the information set of the code (since the corresponding symbols are
  transmitted  directly) and the remaining [23Xn-k[123X symbols provide the redundancy.
  Any  [23Xk[123X  linearly-independent  columns  of [23XG[123X can be chosen as the information
  set,  which defines the systematic form of [23XG[123X up to a permutation of the rows
  of [23XA[123X.[133X
  
  [33X[0;0YThe  IS algorithm and the original performance bounds [Leo88] [Kru89] [CG90]
  are  based  on the observation that for a long random code a set of [23Xk+\Delta[123X
  randomly  selected  columns, with [23X\Delta[123X of order one, are likely to contain
  an  information  set.  ISs  are (approximately) in one-to-one correspondence
  with the column permutations, and a random IS can thus be generated as a set
  of  [13Xpivot[113X columns in the Gauss' algorithm after a random column permutation.
  Thus, if there is a codeword [23Xc[123X of weight [23Xd[123X, the probability to find it among
  the rows of reduced-row-echelon form [23XG_S[123X after a column permutation [23XS[123X can be
  estimated  as  that  for a randomly selected set of [23Xk[123X columns to hit exactly
  one non-zero position in [23Xc[123X.[133X
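
  [33X[0;0YA back-of-the-envelope version of this estimate can be written as a
  hypergeometric probability. The sketch below assumes that every set of [23Xk[123X
  columns is equally likely and ignores whether the chosen columns actually
  form an information set; the function name [10Xhit_probability[110X is
  hypothetical.[133X

```python
from math import comb

def hit_probability(n, k, d):
    """Probability that a uniformly random k-subset of n positions contains
    exactly one of the d positions in the support of a weight-d codeword."""
    return d * comb(n - d, k - 1) / comb(n, k)

# Example: n = 7, k = 4, d = 3 gives 3 * C(4,3) / C(7,4) = 12/35.
p = hit_probability(7, 4, 3)
print(p, 1 / p)  # single-step probability and the implied mean number of steps
```

  [33X[0;0YUnder these assumptions the inverse of the returned value is a rough
  estimate of the number of steps needed to see a given codeword once.[133X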
  
  [33X[0;0YThe  statistics  of  ISs  is  more  complicated in other ensembles of random
  codes, e.g., in linear [13Xlow-density parity-check[113X (LDPC) codes where the check
  matrix  [23XH[123X  (of  rank  [23Xn-k[123X  and  with  rows  orthogonal  to  those  of  [23XG[123X) is
  additionally  required  to  be sparse. Nevertheless, a provable bound can be
  obtained for a related [13Xcovering set[113X (CS) algorithm where a randomly selected
  set of [23Xs\ge k-1[123X positions of a putative codeword are set to be zero, and the
  remaining positions are constructed with the help of linear algebra. In this
  case,  the  optimal  choice  [DKP17]  is to take [23Xs\approx n(1-\theta)[123X, where
  [23X\theta  [123X  is  the  erasure  threshold  of  the  family  of  the  codes under
  consideration.  Since  [23X\theta\ge  R[123X (here [23XR=k/n[123X is the code rate), here more
  zeros must be selected, and the complexity would grow (assuming the distance
  [23Xd[123X remains the same, which is usually [13Xnot[113X the case for LDPC codes).[133X
  
  [33X[0;0YNote  however  that  rows  of  [23XG_S[123X  other  than the last are not expected to
  contain  as  many  zeros (e.g., the first row is only guaranteed to have [23Xk-1[123X
  zeros),  so  it  is  [13Xpossible[113X  that  the  performance of the IS algorithm on
  classical LDPC codes is actually closer to that on random codes as estimated
  by Leon [Leo88].[133X
  
  [33X[0;0YCase of [13Xquantum CSS codes[113X[133X
  
  [33X[0;0YIn  the  case of a random CSS code (with matrices [23XP[123X and [23XQ[123X selected randomly,
  with  the only requirement being the orthogonality between the rows of [23XP[123X and
  [23XQ[123X),  the  performance of the algorithm [14X3.1-2[114X can be estimated as that of the
  CS  algorithm,  in  terms of the erasure threshold of a linear code with the
  parity matrix [23XP[123X, see [DKP17].[133X
  
  [33X[0;0YUnfortunately,  such  an  estimate fails dramatically in the case of [13Xquantum
  LDPC  codes[113X,  where rows of [23XP[123X and [23XQ[123X have weights bounded by some constant [23Xw[123X.
  This  is  a reasonable requirement since the corresponding quantum operators
  (supported on [23Xw[123X qudits) have to actually be measured frequently as a part of
  the  operation  of  the  code,  and  it  is  reasonable  to  expect that the
  measurement  accuracy  goes  down (exponentially) quickly as [23Xw[123X is increased.
  Then,  the  linear  code  orthogonal to the rows of [23XP[123X has the distance [23X\le w[123X
  (the  minimal  weight  of  the  rows  of  [23XQ[123X),  and the corresponding erasure
  threshold is exactly zero. In other words, there is a finite probability
  that a randomly selected set of [23Xw[123X positions contains the support of a vector
  orthogonal to the rows of [23XP[123X (and such a vector would likely have nothing to
  do with non-trivial [13Xquantum[113X codewords which must be linearly independent
  from the rows of [23XQ[123X).[133X
  
  [33X[0;0YOn  the  other  hand,  for  every  permutation [23XS[123X in the algorithm [14X3.1-2[114X, the
  matrix [23XG_S[123X contains exactly [23Xk=n-[123X[10Xrank[110X[23X(P)-[123X[10Xrank[110X[23X(Q)[123X rows orthogonal to rows of [23XP[123X
  and  linearly  independent  from rows of [23XQ[123X (with columns properly permuted).
  These  vectors  contain  at least [23Xs[123X zeros, where [23X[1-\theta_*(P,Q)] n\le s\le
  n-[123X[10Xrank[110X[23X(Q)[123X, where [23X\theta_*(P,Q)[123X is the erasure threshold for [23XZ[123X-like codewords
  in the quantum CSS code with [23XH_X=P[123X and [23XH_Z=Q[123X.[133X
  
  [33X[0;0Y[13XWhat is it that we do not understand?[113X[133X
  
  [33X[0;0YWhat  missing  is an understanding of the statistics of the ISs of interest,
  namely,  the  ISs  that  overlap with a minimum-weight codeword in one (or a
  few) positions.[133X
  
  [33X[0;0YSecond,  we  know  that  a  given  column  permutation [23XS[123X leads to the unique
  information  set,  and  that  every  information  set  can  be obtained by a
  suitably  chosen column permutation. However, there is no guarantee that the
  resulting  information sets have equal probabilities. In fact, it is easy to
  construct  small matrices where different information sets are obtained from
  different   numbers   of   column  permutations  (and  thus  have  [13Xdifferent[113X
  probabilities). It is not clear whether some of the ISs may have vanishingly
  small  probabilities  in  the  limit  of  large  codes;  in  such a case the
  algorithm may take an excessively long time to converge.[133X
  
  
  [1X3.3 [33X[0;0YEmpirical estimate of the success probability[133X[101X
  
  [33X[0;0YThe  probability  to  find a codeword after [23XN[123X rounds of the algorithm can be
  estimated  empirically, by counting the number of times each codeword of the
  minimum  weight was discovered. We [13Xexpect[113X the probability [23XP(c)[123X to discover a
  given  codeword [23Xc[123X to depend only on its (symplectic) weight [10Xwgt[110X[23X(c)[123X, with the
  probability  a  monotonously  decreasing function of the weight. If, after [23XN[123X
  steps,  codewords  [23Xc_1[123X, [23Xc_2[123X, [23X\ldots[123X , [23Xc_m[123X of the same (minimal) weight [23Xw[123X are
  discovered  [23Xn_1[123X,  [23Xn_2[123X, [23X\ldots[123X , [23Xn_m[123X times, respectively, we can estimate the
  corresponding Poisson parameter as[133X
  
  
  [24X[33X[0;6Y\lambda_w =\frac{1}{N m}\sum_{i=1}^m n_i.[133X
  
  [124X
  
  [33X[0;0YThen,  the probability that a codeword [23Xc_0[123X of the true minimal weight [23X d < w
  [123X  be  [13Xnot[113X discovered after [23XN[123X steps can be upper bounded as (the inequalities
  tend to saturate and become equalities in the limit of small [23X\lambda_w[123X)[133X
  
  
  [24X[33X[0;6YP_{\rm            fail}            <            (1-\lambda_w)^N            <
  e^{-N\lambda_w}=\exp\left(-m^{-1}\sum_{i=1}^m n_i\right)\equiv \exp(-\langle
  n\rangle).[133X
  
  [124X
  
  [33X[0;0YThus,  the probability to fail is decreasing as an exponent of the parameter
  [23X\langle  n\rangle[123X, the [13Xaverage number of times a minimum-weight codeword has
  been found.[113X[133X
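
  [33X[0;0YThe estimate of [23X\lambda_w[123X and the resulting bound on the failure probability
  are simple enough to evaluate by hand; the Python sketch below spells out
  the arithmetic. The function name [10Xfailure_bound[110X and the sample counts are
  made up for illustration and are not output of the package.[133X

```python
import math

def failure_bound(counts, steps):
    """Return (lambda_w, <n>, exp(-<n>)) from the discovery counts n_1..n_m
    of the minimum-weight codewords found in `steps` rounds."""
    m = len(counts)
    lam = sum(counts) / (steps * m)  # per-step Poisson parameter
    mean_n = sum(counts) / m         # average number of discoveries <n>
    return lam, mean_n, math.exp(-mean_n)

# Hypothetical counts: three codewords found 12, 9 and 15 times in 1000 steps.
lam, mean_n, bound = failure_bound([12, 9, 15], 1000)
print(lam, mean_n, bound)
```

  [33X[0;0YFor these made-up counts, [23X\langle n\rangle=12[123X, so the bound on the failure
  probability is [23Xe^{-12}[123X.[133X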
  
  [33X[0;0YThe  hypothesis about all [23XP(c_i)[123X being equal to [23X\lambda_w[123X is testable, e.g.,
  if   one   considers   the  distribution  of  the  ratios  [23Xx_i=n_i/N[123X,  where
  [23XN=\sum_{i=1}^m  n_i[123X is the total number of codewords found. These quantities
  sum   up   to   one   and   are   distributed   according   to   multinomial
  distribution[Ste53].  Further,  under  our  assumption  of  all [23XP(c_i)[123X being
  equal,   we  also  expect  the  outcome  probabilities  in  the  multinomial
  distribution to be all equal, [23X\pi_i=1/m[123X, [23X1\le i\le m[123X.[133X
  
  [33X[0;0YThis  hypothesis  can  be tested using Pearson's [23X\chi^2[123X test. Namely, in the
  limit where the total number of observations [23XN[123X diverges, the quantity[133X
  
  
  [24X[33X[0;6YX^2=\sum_{i=1}^m   \frac{(n_i-N   \pi_i)^2}{   N\pi_i}=   N^{-1}\sum_{i=1}^m
  \frac{n_i^2}{\pi_i}-N         \stackrel{\pi_i=1/m}\to\frac{m}{N}\sum_{i=1}^m
  n_i^2-N,[133X
  
  [124X
  
  [33X[0;0Yis  expected  to  be  distributed according to the [23X\chi^2_{m-1}[123X distribution
  with [23Xm-1[123X parameters, see [CL54] [Cra99].[133X
  
  [33X[0;0YIn  practice,  we can approximate with the [23X\chi^2_{m-1}[123X distribution as long
  as  the  total  [23XN[123X  be  large compared to the number [23Xm[123X of the codewords found
  (i.e.,  the  average  [23X\langle  n\rangle[123X  must  be  large,  which is the same
  condition as needed for confidence in the result.)[133X
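
  [33X[0;0YFor reference, the statistic [23XX^2[123X for the equiprobable case [23X\pi_i=1/m[123X can be
  computed from the counts alone. The Python sketch below uses the
  hypothetical name [10Xpearson_x2[110X and made-up counts, and also evaluates the
  equivalent closed form [23X(m/N)\sum_i n_i^2-N[123X as a cross-check.[133X

```python
def pearson_x2(counts):
    """Pearson's X^2 for the hypothesis that all m outcomes are equiprobable."""
    m = len(counts)
    total = sum(counts)   # N, the total number of observations
    expected = total / m  # N * pi_i with pi_i = 1/m
    return sum((n - expected) ** 2 / expected for n in counts)

def pearson_x2_closed_form(counts):
    """Same quantity written as (m/N) * sum(n_i^2) - N."""
    m = len(counts)
    total = sum(counts)
    return m / total * sum(n * n for n in counts) - total

# Hypothetical counts: three codewords found 12, 9 and 15 times.
print(pearson_x2([12, 9, 15]))              # prints 1.5
print(pearson_x2_closed_form([12, 9, 15]))  # prints 1.5
```

  [33X[0;0YTo turn the value into a likelihood one would compare it with the
  [23X\chi^2_{m-1}[123X distribution, for instance with [10Xscipy.stats.chi2.sf(x2, m - 1)[110X
  if SciPy is available.[133X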
  
  [33X[0;0YWith  [10Xdebug[4][110X  set  (binary value 8) in [10XDistRandCSS[110X and [10XDistRandStab[110X ([14X4.1[114X),
  whenever  more  than one minimum-weight vector is found, the quantity [23XX^2[123X is
  computed  and output along with the average number of times [23X\langle n\rangle[123X
  a minimum-weight codeword has been found. However, no attempt is made to
  analyze the corresponding value or calculate the likelihood of the null
  hypothesis that the codewords are equiprobable.[133X
  
