Iterating Johansen test over multiple pairs

Luca_Donghi_13dBb · August 4, 2021, 4:38pm

Hi,

I have tried to iterate the Johansen test over multiple pairs of currencies, but now I do not know how to retrieve the results in an orderly manner. The code is the following:

assets = ['eur_usd', 
          'usd_jpy', 
          'gbp_usd', 
          'usd_chf',
          'aud_usd', 
          'nzd_usd',
          'usd_cad',
          'usd_nok',
          'usd_sek',
          'usd_mxn',
          'usd_try',
          'usd_zar']

import itertools
combinations = itertools.combinations(assets, r=2)

results = []
for a, b in combinations:
        data = pd.DataFrame(usd_pairs[[a, b]])
        results = coint_johansen(data,0,1)

where "usd_pairs" is my dataframe of currency prices.

The code seems to work, but now I don't know how to access the statistics in a manner that can help me select the best pairs.

Thanks for your help!

Luca

Vibhu_uxX · August 5, 2021, 3:54am

Hi Luca,

You need to print the trace statistics and the Eigen (maximum eigenvalue) statistics inside your for loop. That gives both sets of test statistics for every combination of pairs, which you can then compare with the critical values at the 90%, 95% and 99% confidence levels.

import itertools
combinations = itertools.combinations(assets, r=2)

results = []
for a, b in combinations:
    data = pd.DataFrame(usd_pairs[[a, b]])
    result = coint_johansen(data,0,1)
    # Print trace statistics and eigen statistics
    print ('--------------------------------------------------')
    print ('--> Trace Statistics')
    print ('variable statistic Crit-90% Crit-95%  Crit-99%')
    for i in range(len(result.lr1)):
        print ("r <= " + str(i), round(result.lr1[i], 4), round(result.cvt[i, 0],4), round(result.cvt[i, 1],4), round(result.cvt[i, 2],4))
        print ('--------------------------------------------------')
        print ('--> Eigen Statistics')
        print ('variable statistic Crit-90% Crit-95%  Crit-99%')
    for i in range(len(result.lr2)):
        print ("r <= " + str(i), round(result.lr2[i], 4), round(result.cvm[i, 0],4), round(result.cvm[i, 1],4), round(result.cvm[i, 2],4))

Hope that helps. Thanks!
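To keep the numbers retrievable rather than only printed, the same statistics could also be collected as labelled rows, one set per pair. Note that in the loop above the three Eigen header prints are indented under the first `for`, so they repeat after every trace row; building the rows first and printing afterwards sidesteps that. What follows is only a minimal sketch: `johansen_rows` is a hypothetical helper, and it assumes nothing beyond a result object exposing `lr1`, `lr2`, `cvt` and `cvm` as used above. The sample values are the first set of numbers from the output Luca posts in this thread, wrapped in a stand-in object so the snippet runs without any price data; the pair labels are placeholders.

```python
from types import SimpleNamespace

LEVELS = ("90%", "95%", "99%")  # column order of cvt / cvm, as in the printed headers


def johansen_rows(pair, result):
    """Return one dict per (test, rank) row for a single pair."""
    rows = []
    tests = (("trace", result.lr1, result.cvt),
             ("max_eig", result.lr2, result.cvm))
    for test_name, stats, crit in tests:
        for i in range(len(stats)):
            row = {"pair": pair,
                   "test": test_name,
                   "null": f"r <= {i}",
                   "statistic": round(float(stats[i]), 4)}
            for j, level in enumerate(LEVELS):
                row[f"crit_{level}"] = round(float(crit[i][j]), 4)
            rows.append(row)
    return rows


# Stand-in for a coint_johansen result (hypothetical object, real numbers from the thread)
sample = SimpleNamespace(
    lr1=[11.0503, 1.4775],
    cvt=[[13.4294, 15.4943, 19.9349], [2.7055, 3.8415, 6.6349]],
    lr2=[9.5728, 1.4775],
    cvm=[[12.2971, 14.2639, 18.52], [2.7055, 3.8415, 6.6349]],
)

rows = johansen_rows(("pair_a", "pair_b"), sample)  # placeholder pair labels
for row in rows:
    print(row)
```

Inside the loop over combinations, something like `results.extend(johansen_rows((a, b), result))` would accumulate the rows for every pair, and `pd.DataFrame(results)` would then turn them into a table that can be sorted or filtered.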

Luca_Donghi_13dBb · August 5, 2021, 7:21am

Dear Vibhu,

Many thanks for your quick reply. I followed your advice and it worked, but in the output I cannot tell which pair each test result belongs to. Do you know how I can tweak the code to get a more useful output?

The output I get is below:

--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 11.0503 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 1.4775 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 9.5728 12.2971 14.2639 18.52
r <= 1 1.4775 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 10.883 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 3.72 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 7.1629 12.2971 14.2639 18.52
r <= 1 3.72 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 20.2819 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 4.7692 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 15.5126 12.2971 14.2639 18.52
r <= 1 4.7692 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 14.1388 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 1.3412 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 12.7977 12.2971 14.2639 18.52
r <= 1 1.3412 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 15.943 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 3.723 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 12.22 12.2971 14.2639 18.52
r <= 1 3.723 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 16.0497 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 1.609 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 14.4407 12.2971 14.2639 18.52
r <= 1 1.609 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 15.9143 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 3.0217 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 12.8927 12.2971 14.2639 18.52
r <= 1 3.0217 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 13.6912 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 1.3858 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 12.3054 12.2971 14.2639 18.52
r <= 1 1.3858 2.7055 3.8415 6.6349
--------------------------------------------------
--> Trace Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 11.8216 13.4294 15.4943 19.9349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 1 2.1262 2.7055 3.8415 6.6349
--------------------------------------------------
--> Eigen Statistics
variable statistic Crit-90% Crit-95%  Crit-99%
r <= 0 9.6954 12.2971 14.263

Rishabh_Mittal_853Xn · August 6, 2021, 3:01am

Hey Luca!

You can simply add a "print" statement in the loop. I have added one line of code, on line 9 of the snippet below. Please check and let me know if that resolves your query.

import itertools
combinations = itertools.combinations(assets, r=2)

results = []
for a, b in combinations:
    data = pd.DataFrame(usd_pairs[[a, b]])
    result = coint_johansen(data,0,1)

    print(f"\nPairs: {[a, b]}\n")

    # Print trace statistics and eigen statistics
    print ('--------------------------------------------------')
    print ('--> Trace Statistics')
    print ('variable statistic Crit-90% Crit-95%  Crit-99%')
    for i in range(len(result.lr1)):
        print ("r <= " + str(i), round(result.lr1[i], 4), round(result.cvt[i, 0],4), round(result.cvt[i, 1],4), round(result.cvt[i, 2],4))
    print ('--------------------------------------------------')
    print ('--> Eigen Statistics')
    print ('variable statistic Crit-90% Crit-95%  Crit-99%')
    for i in range(len(result.lr2)):
        print ("r <= " + str(i), round(result.lr2[i], 4), round(result.cvm[i, 0],4), round(result.cvm[i, 1],4), round(result.cvm[i, 2],4))

Luca_Donghi_13dBb · August 6, 2021, 10:06am

Hey Rishabh,

Many thanks for your feedback. It works perfectly, but since there are a lot of combinations, how can I tell Python to print only those pairs whose Trace and Eigen statistics are greater than the Crit-90% level?

I tried the code below, inserting an if statement. It doesn't raise any errors, but unfortunately it doesn't print anything.

assets = ['eur_usd', 
          'usd_jpy', 
          'gbp_usd', 
          'usd_chf',
          'aud_usd', 
          'nzd_usd',
          'usd_cad',
          'usd_nok',
          'usd_sek',
          'usd_mxn',
          'usd_try',
          'usd_zar']

import itertools
combinations = itertools.combinations(assets, r=2)

for a, b in combinations:
    data = pd.DataFrame(usd_pairs[[a, b]])
    result = coint_johansen(data,0,1)

    print(f"\nPairs: {[a, b]}\n")
    

    # Print trace statistics and eigen statistics
    print ('--------------------------------------------------')
    print ('--> Trace Statistics')
    print ('variable statistic Crit-90%')
    for i in range(len(result.lr1)):
        if result.lr1[i] > result.cvt[i, 0]:
            print ("r <= " + str(i), round(result.lr1[i], 4), round(result.cvt[i, 0],4))
    print ('--------------------------------------------------')
    print ('--> Eigen Statistics')
    print ('variable statistic Crit-90%')
    for i in range(len(result.lr2)):
        if result.lr2[i] > result.cvm[i, 0]:
            print ("r <= " + str(i), round(result.lr2[i], 4), round(result.cvm[i, 0],4))

Many thanks in advance for your help!

Luca

Satyapriya_Chaudhari_63X1f · August 9, 2021, 10:08am

Hi Luca,

Ideally, the if statement should let you filter the pairs based on the condition you mentioned, so I am not sure what the issue is. Can you please share the complete code file and the data file so that I can have a look?

Thanks for your patience.
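For reference, the snippet above filters individual rows while still printing the pair label and section headers for every pair. One alternative is to decide once per pair and print only the pairs that pass. This is a sketch under assumptions, not a diagnosis of the problem: `rejects_no_cointegration` and `LEVEL_COLUMN` are hypothetical names, the column order 90%/95%/99% is taken from the printed headers, and the two stand-in objects reuse the first and third sets of numbers from the output posted earlier in the thread.

```python
from types import SimpleNamespace

LEVEL_COLUMN = {90: 0, 95: 1, 99: 2}  # column order of cvt / cvm


def rejects_no_cointegration(result, level=90):
    """True if both r = 0 statistics exceed their critical value at `level`."""
    col = LEVEL_COLUMN[level]
    trace_ok = result.lr1[0] > result.cvt[0][col]
    eigen_ok = result.lr2[0] > result.cvm[0][col]
    return bool(trace_ok and eigen_ok)


# Critical values as printed in the thread (rows: r <= 0, r <= 1)
CVT = [[13.4294, 15.4943, 19.9349], [2.7055, 3.8415, 6.6349]]
CVM = [[12.2971, 14.2639, 18.52], [2.7055, 3.8415, 6.6349]]

# Stand-ins for two coint_johansen results (hypothetical objects)
weak = SimpleNamespace(lr1=[11.0503, 1.4775], lr2=[9.5728, 1.4775], cvt=CVT, cvm=CVM)
strong = SimpleNamespace(lr1=[20.2819, 4.7692], lr2=[15.5126, 4.7692], cvt=CVT, cvm=CVM)

for name, res in (("weak", weak), ("strong", strong)):
    if rejects_no_cointegration(res, level=90):
        print(f"{name}: both statistics exceed the 90% critical value")
```

In the real loop, the same check would wrap the `print(f"\nPairs: {[a, b]}\n")` line and the two printing sections, so that nothing is printed for pairs that fail.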

Luca_Donghi_13dBb · August 9, 2021, 4:25pm

Hi Satyapriya,

My issue was that the code provided by Rishabh gives me the complete list of Johansen tests for every combination, so I have to go through them one by one to select the right pairs (those with the best statistics). So I asked Rishabh if he knew how to obtain the same output but filtered (as I tried to do in the comment above).

By the way, I managed to find a solution on my own (see below):

assets = ['eur_usd', 
          'usd_jpy', 
          'gbp_usd', 
          'usd_chf',
          'aud_usd', 
          'nzd_usd',
          'usd_cad',
          'usd_nok',
          'usd_sek',
          'usd_mxn',
          'usd_try',
          'usd_zar']


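# Imports assumed from context; coint_johansen is taken to be the statsmodels implementation
import itertools
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import coint_johansen

combinations = itertools.combinations(assets, r=2)

cointegrated_pairs = []

for a, b in combinations:
    data = pd.DataFrame(df_jt[[a, b]])
    result = coint_johansen(data,0,1)

    # The trace statistic and maximum eigenvalue statistic are stored in lr1 and lr2.
    # Index 0 is the r <= 0 row (null of no cointegration) and column 0 of cvt/cvm
    # is the 90% critical value; keep the pair only if both statistics exceed it.
    if result.lr1[0] > result.cvt[0, 0] and result.lr2[0] > result.cvm[0, 0]: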
        
        cointegrated_pairs.append(dict(
            a=a,
            b=b
        ))
        
print(cointegrated_pairs)

I had to struggle a little bit with the code, but eventually I obtained the desired output.

Thanks for your support.

Best regards,

Luca
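Since the goal was to pick out the pairs with the best statistics, the filtered pairs could additionally be ordered by how far the r <= 0 trace statistic sits above its critical value. This is a rough heuristic for sorting, not a formal ranking of cointegration strength. The sketch below makes several assumptions: `rank_by_trace_margin` is a hypothetical helper, the dictionary keys are placeholder labels rather than actual currency pairs, and the three trace statistics are the first, third and fifth values from the output posted earlier in the thread.

```python
from types import SimpleNamespace


def rank_by_trace_margin(results, col=0):
    """Sort pairs by (r <= 0 trace statistic - critical value), largest first.

    `col` selects the confidence level column of cvt: 0 = 90%, 1 = 95%, 2 = 99%.
    """
    margins = {pair: res.lr1[0] - res.cvt[0][col] for pair, res in results.items()}
    return sorted(margins.items(), key=lambda item: item[1], reverse=True)


# Trace critical values as printed in the thread (rows: r <= 0, r <= 1)
CVT = [[13.4294, 15.4943, 19.9349], [2.7055, 3.8415, 6.6349]]

# Stand-ins for coint_johansen results, keyed by placeholder labels
results = {
    "block_1": SimpleNamespace(lr1=[11.0503, 1.4775], cvt=CVT),
    "block_3": SimpleNamespace(lr1=[20.2819, 4.7692], cvt=CVT),
    "block_5": SimpleNamespace(lr1=[15.943, 3.723], cvt=CVT),
}

ranking = rank_by_trace_margin(results)
for pair, margin in ranking:
    print(pair, round(margin, 4))
```

In the loop above, storing `result` in a dictionary keyed by `(a, b)` for each accepted pair would give this helper the input it expects.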

Satyapriya_Chaudhari_63X1f · August 10, 2021, 2:07pm

Hi Luca,

Glad to know that you were able to resolve the issue.

Best Regards,

Satyapriya