#1 2014-11-12 06:22:16

hszz8888
Member
4 posts

hszz8888 said:

[resolved] New Here Need Help

I'm new here. Talend is a really cool tool.
I have been given an assignment like this:
   Using Talend as the ETL solution, read the files from the FTP site and load the data into a local database with the following features.
·         Before loading the data into the database, log the list of files in a log table and update each file's status (Downloaded, Processed, Failed).
·         Any duplicate file should be reported via email (no duplicate files are to be loaded).
·         The process can run only if all 10 files are available.
·         Notification of success or failure of the job, along with the job log.
·         Propose a quality report for the daily data load.
I did some research and successfully downloaded a file from the FTP server using tFTPConnection, tFTPFileList and tFTPGet.
But I cannot find any tutorial about file operations. Could you give me a hand with the requirements listed above?
I don't need the details of this assignment. Please just tell me which components from the Palette (tFileList, etc.) I can use for it.
Thank you very much!
Regards,

Offline

#2 2014-11-12 07:21:28

sanvaibhav
Member
1719 posts

sanvaibhav said:

Re: [resolved] New Here Need Help

Before loading the data into database, log the list of files in log table and update its status (Downloaded , Processed , Failed)
>> tFileList exposes the current file name as a global variable (tFileList_1_CURRENT_FILE for a component named tFileList_1). Use an Iterate link so that each file name is inserted into the database as one row.
By dividing the job into sections, you can record Downloaded, Processed and Failed. These are different statuses of the same file name, so the name itself does not change within an iteration; only the status is updated.
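To make the "same file name, different status" idea concrete, here is a minimal plain-Java sketch. It is not Talend-generated code: the class `FileStatusLog` and its methods are hypothetical, and the in-memory map only stands in for the log table that a database output component would write to in a real job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the log table: one row per file name,
// with the status column updated in place as the job progresses.
class FileStatusLog {
    enum Status { DOWNLOADED, PROCESSED, FAILED }

    private final Map<String, Status> rows = new LinkedHashMap<>();

    // Called once per iteration, right after the download step.
    void markDownloaded(String fileName) {
        rows.put(fileName, Status.DOWNLOADED);
    }

    // Called after the load step; the file name is the key, only the status changes.
    void markResult(String fileName, boolean loadSucceeded) {
        if (!rows.containsKey(fileName)) {
            throw new IllegalStateException("File was never logged as downloaded: " + fileName);
        }
        rows.put(fileName, loadSucceeded ? Status.PROCESSED : Status.FAILED);
    }

    // Returns null when the file name has not been logged at all.
    Status statusOf(String fileName) {
        return rows.get(fileName);
    }
}
```

In a job, the two calls would correspond to two sections: an insert after the FTP download, and an update after the load subjob succeeds or fails.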

Any duplicate file should be notified via email.(no duplicate files to be loaded)
>> To do this, create a lookup table of successfully loaded files, and use tMap to inner join it with the incoming file name from the variable above (which can also be stored in a context variable). File names that match are duplicates; only names with no match should go on to the load.
- Once the lookup is done, proceed to the subjob which loads data into the database table. This will prevent duplicate files from being loaded.
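The lookup logic can be sketched in plain Java as follows. This is only an illustration of what the tMap join decides, not how tMap is implemented; `DuplicateCheck` and its fields are hypothetical names, and `alreadyLoaded` stands in for the lookup table of successfully loaded files.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Splits incoming file names into "safe to load" and "duplicate".
class DuplicateCheck {
    final List<String> toLoad = new ArrayList<>();
    final List<String> duplicates = new ArrayList<>();

    static DuplicateCheck split(Collection<String> incoming, Set<String> alreadyLoaded) {
        DuplicateCheck result = new DuplicateCheck();
        // Start from the names already loaded, so repeats inside the
        // incoming batch are caught as well as repeats of earlier loads.
        Set<String> seen = new HashSet<>(alreadyLoaded);
        for (String name : incoming) {
            if (seen.add(name)) {
                result.toLoad.add(name);      // first time this name is seen
            } else {
                result.duplicates.add(name);  // candidate for the notification mail
            }
        }
        return result;
    }
}
```

The `duplicates` list is what would feed the notification email, while `toLoad` is what continues to the load subjob.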

Process can run only if all 10 files are available.
>> Write a separate master job which lists the files and counts them, and use a Run if connector to call the subjob that processes the files only when the count is 10.

Notification of success and failure of job along with Job log.
>> Use the tSendMail component when the master job succeeds, and include whatever logs you want. You can either use routine variables to hold the audit information, or collect it all in a database table, read that table, format it, and include it in the mail as an HTML table.
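As a rough idea of the "format it as an HTML table" step, a small helper like the one below could live in a user routine and build the mail body. Everything here is an assumption for illustration: the class `MailBodyBuilder`, the two-column layout (file name, status), and the escaping rules are not taken from Talend.

```java
import java.util.List;

// Builds a simple HTML table from audit rows of the form {fileName, status}.
class MailBodyBuilder {
    // Minimal escaping so a file name cannot break the markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String toHtmlTable(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("<table border=\"1\"><tr><th>File</th><th>Status</th></tr>");
        for (String[] row : rows) {
            sb.append("<tr><td>").append(escape(row[0])).append("</td>")
              .append("<td>").append(escape(row[1])).append("</td></tr>");
        }
        return sb.append("</table>").toString();
    }
}
```

The resulting string would then be used as the message text of the mail component, with the message type set to HTML.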

Propose a quality report for daily data load.
>> You can derive the report from the audit log...
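One possible shape for such a report, sketched in plain Java: count the audit rows per status and compute the share of files that were processed. The class `DailyLoadReport`, the status strings, and the choice of metrics are assumptions for illustration; the thread does not say what the report should contain.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Summarises one day's audit rows, given only the status of each file.
class DailyLoadReport {
    // Number of files per status, sorted by status name.
    static Map<String, Integer> countByStatus(List<String> statuses) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String status : statuses) {
            counts.merge(status, 1, Integer::sum);
        }
        return counts;
    }

    // Share of files that reached "Processed"; 0.0 when there are no rows.
    static double processedRatio(List<String> statuses) {
        if (statuses.isEmpty()) {
            return 0.0;
        }
        int processed = countByStatus(statuses).getOrDefault("Processed", 0);
        return (double) processed / statuses.size();
    }
}
```

In a job, the same aggregation could equally be done with a GROUP BY query on the log table; the Java version only shows the arithmetic.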




I think you got some insight on how to do it...




Thanks
Vaibhav


Talend Certified Consultant

Offline

#3 2014-11-12 16:37:25

hszz8888
Member
4 posts

hszz8888 said:

Re: [resolved] New Here Need Help

Thank you very much. I will carefully read your post!

Offline

#4 2014-11-13 05:46:33

hszz8888
Member
4 posts

hszz8888 said:

Re: [resolved] New Here Need Help

I tried my best. Does this work?
"log" is the table that I used for storing the file names.

[Screenshot: mini_blob_20141113-0619.png]

Last edited by hszz8888 (2014-11-13 06:19:31)

Offline

#5 2014-11-13 06:49:00

sanvaibhav
Member
1719 posts

sanvaibhav said:

Re: [resolved] New Here Need Help

You will have to use a Run if link to execute your subjob only when all 10 required files are present.
You can get the file count from a global variable, or keep your own counter, e.g. context.cnt = context.cnt + 1 in a tJavaRow placed in the flow.
But you are on the right track.
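Both options can be sketched in plain Java. This is a hedged illustration, not generated job code: `FileCountGate` is a hypothetical class, the `Map` only imitates Talend's globalMap, and the key you pass in (for example `tFileList_1_NB_FILE`) depends on the name of the list component in your own job.

```java
import java.util.Map;

// Decides whether the processing subjob may run.
class FileCountGate {
    static final int EXPECTED_FILES = 10; // from the assignment

    // Option 1: read the count that the file-list component stored as a global variable.
    static boolean countIsComplete(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return value instanceof Integer && ((Integer) value) == EXPECTED_FILES;
    }

    // Option 2: keep your own counter, like context.cnt = context.cnt + 1.
    private int cnt = 0;

    void onFile() {
        cnt = cnt + 1;
    }

    boolean ready() {
        return cnt == EXPECTED_FILES;
    }
}
```

A Run if link takes a Java boolean expression as its condition, so the body of `countIsComplete` or `ready` is the kind of expression that would go there.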

Vaibhav


Talend Certified Consultant

Offline

#6 2014-11-13 15:56:15

hszz8888
Member
4 posts

hszz8888 said:

Re: [resolved] New Here Need Help

Thank you!

Offline
