This task has been studied before, and the results were published in these papers:
J. F. Cardoso, "Multidimensional Independent Component Analysis", Proc. 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, May 1998, Vol. 4, pp. 1941-1944.
D. Callaerts, "Signal Separation Methods based on Singular Value Decomposition and their Application to the Real-Time Extraction of the Fetal Electrocardiogram from Cutaneous Recordings", Ph.D. Thesis, K.U.Leuven - E.E. Dept., Dec. 1989.
L. De Lathauwer, B. De Moor, J. Vandewalle, "Fetal Electrocardiogram Extraction by Source Subspace Separation", Proc. IEEE SP / ATHOS Workshop on HOS, June 12-14, 1995, Girona, Spain, pp. 134-138.
In this workbook I am going to show you how a similar result can be obtained using the ICA algorithms available in the Shogun Machine Learning Toolbox.
First we need some data. Luckily, an ECG dataset is distributed in the Shogun data repository, so the first step is to change to the data directory and then load the data.
# change to the shogun-data directory
import os

SHOGUN_DATA_DIR = os.getenv('SHOGUN_DATA_DIR', '../../../data')
os.chdir(os.path.join(SHOGUN_DATA_DIR, 'ica'))
import numpy as np

# load data
# Data originally from:
# http://perso.telecom-paristech.fr/~cardoso/icacentral/base_single.html
data = np.loadtxt('foetal_ecg.dat')

# time steps
time_steps = data[:,0]

# abdominal signals
abdominal2 = data[:,1]
abdominal3 = data[:,2]
abdominal4 = data[:,3]
abdominal5 = data[:,4]
abdominal6 = data[:,5]

# thoracic signals
thoracic7 = data[:,6]
thoracic8 = data[:,7]
thoracic9 = data[:,8]
Before we go any further let's take a look at this data by plotting it:
%matplotlib inline
# plot signals
import pylab as pl

# abdominal signals
for i in range(1,6):
    pl.figure(figsize=(14,3))
    pl.plot(time_steps, data[:,i], 'r')
    pl.title('Abdominal %d' % (i))
    pl.grid()
    pl.show()

# thoracic signals
for i in range(6,9):
    pl.figure(figsize=(14,3))
    pl.plot(time_steps, data[:,i], 'r')
    pl.title('Thoracic %d' % (i))
    pl.grid()
    pl.show()
The peaks in the plots represent heartbeats, but they're pretty hard to interpret, and I definitely can't see two distinct signals. Let's see what we can do with ICA!
In general, source separation needs at least as many mixed signals as sources we're hoping to separate, and in this case we actually have a lot more (8 mixtures but only 2 sources, mother and baby). There are several approaches to handling this situation: some algorithms are specifically designed for it, while in other cases the data is pre-processed with Principal Component Analysis (PCA) to reduce its dimension first. It is also common to simply apply the separation to all the mixed signals and then choose some of the extracted signals manually or using some other known criterion, which is what I'll be showing in this example.
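For comparison, here is a minimal sketch of the PCA pre-processing route, which this workbook does not take. It uses plain NumPy on synthetic data rather than the ECG recordings. The helper `pca_reduce`, the demo array `X_demo` and the choice of two components are all my own assumptions for illustration, not part of Shogun.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the signals (rows of X) onto the top principal components.

    X has shape (n_signals, n_samples). Hypothetical helper, not a Shogun API.
    """
    # Centre each signal
    Xc = X - X.mean(axis=1, keepdims=True)
    # Covariance between the signals
    cov = Xc @ Xc.T / (Xc.shape[1] - 1)
    # eigh returns eigenvalues in ascending order, so reorder them
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order].T            # shape (n_components, n_signals)
    return W @ Xc, eigvals[order]

# Synthetic stand-in for the recordings: 8 noisy mixtures of 2 sources
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 2500)
sources = np.vstack([np.sin(2 * np.pi * 1.2 * t),
                     np.sign(np.sin(2 * np.pi * 2.3 * t))])
mixing = rng.normal(size=(8, 2))
X_demo = mixing @ sources + 0.01 * rng.normal(size=(8, t.size))

X_reduced, variances = pca_reduce(X_demo, n_components=2)
print(X_reduced.shape, variances)
```

The reduced matrix has one row per retained component, so a separation algorithm run on it would return as many signals as components kept.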
Now we create our ICA data set and convert to a Shogun features type:
from shogun import RealFeatures

# Signal Matrix X
X = (np.c_[abdominal2, abdominal3, abdominal4, abdominal5, abdominal6, thoracic7, thoracic8, thoracic9]).T

# Convert to features for shogun
mixed_signals = RealFeatures((X).astype(np.float64))
Next we apply the ICA algorithm to separate the sources:
from shogun import SOBI

# Separating with SOBI
sep = SOBI()
sep.set_tau(1.0*np.arange(0,120))

signals = sep.apply(mixed_signals)
S_ = signals.get_feature_matrix()
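SOBI is a second-order method: it works from covariance matrices of the signals at several time lags and looks for an unmixing that approximately diagonalizes all of them at once. I assume the values passed to `set_tau` above play the role of those lags. The sketch below only computes such lagged covariance matrices with NumPy on random data. The helper `lagged_covariances` is hypothetical and is not how Shogun is implemented internally.

```python
import numpy as np

def lagged_covariances(X, lags):
    """Return one symmetrised covariance matrix per time lag.

    X has shape (n_signals, n_samples). Hypothetical helper for
    illustration only; Shogun computes its own statistics internally.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    n = Xc.shape[1]
    covs = []
    for lag in lags:
        lag = int(lag)
        # Correlate every signal with every signal shifted by `lag` samples
        C = Xc[:, :n - lag] @ Xc[:, lag:].T / (n - lag)
        # Symmetrise, as is commonly done before joint diagonalisation
        covs.append(0.5 * (C + C.T))
    return np.array(covs)

# Random stand-in with the same layout as X: 8 signals, 2500 samples
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(8, 2500))

covs = lagged_covariances(X_demo, np.arange(0, 120))
print(covs.shape)  # one 8x8 matrix for each of the 120 lags
```

The matrix at lag 0 is the ordinary covariance. The later ones capture temporal structure, which is what lets a second-order method tell apart sources with different rhythms.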
And we plot the separated signals:
# Show separation results

# Separated Signal i (one row of S_ per separated signal)
for i in range(S_.shape[0]):
    pl.figure(figsize=(14,3))
    pl.plot(time_steps, S_[i], 'r')
    pl.title('Separated Signal %d' % (i+1))
    pl.grid()
    pl.show()
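Picking the interesting signals by eye works here, but one possible automatic criterion is peakedness. Heartbeat traces are dominated by sharp spikes, so they tend to have a much higher kurtosis than noise-like components. This is my own suggestion rather than something the papers above prescribe. The helper `excess_kurtosis` and the synthetic `S_demo` below are assumptions for illustration.

```python
import numpy as np

def excess_kurtosis(S):
    """Excess kurtosis of each row of S, shape (n_signals, n_samples)."""
    Sc = S - S.mean(axis=1, keepdims=True)
    var = (Sc ** 2).mean(axis=1)
    return (Sc ** 4).mean(axis=1) / var ** 2 - 3.0

# Synthetic stand-in for separated signals: one noise-like, one spiky
rng = np.random.default_rng(0)
n = 2500
noise = rng.normal(size=n)
spiky = np.zeros(n)
spiky[::250] = 1.0                  # a sharp peak every 250 samples
S_demo = np.vstack([noise, spiky])

k = excess_kurtosis(S_demo)
ranking = np.argsort(k)[::-1]       # most peaked signal first
print(ranking, k)
```

Replacing `S_demo` with the `S_` matrix from the separation step would rank the separated signals the same way. The top entries are only candidates and should still be checked against the plots.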