
Category: Tech Blog

Here you can find my random notes on software design and development, software engineering, coding, and all other tech-related stuff.

File editing within a docker container with cat


Effective Data Science Team Leadership: Challenges

Team leaders play a vital role in the success of any team, whether it’s a small start-up or a large corporation. A team leader is like a conductor leading an orchestra. Just as a conductor brings together a group of musicians with different instruments and backgrounds, a team leader brings together a group of individuals with different skills and expertise. The conductor’s job is to create a cohesive and harmonious piece of music, and the team leader’s job is to create a cohesive and harmonious team.

Like a conductor, a team leader must have a clear vision of what they want to achieve and be able to communicate that vision to their team. They must also be able to listen to the needs and concerns of their team members and adapt their approach as needed. Just as a conductor may need to make adjustments to the tempo or dynamics of a piece of music based on the capabilities of the musicians, a team leader may need to make adjustments to their management style based on the needs and skills of their team.

A team leader must also be able to motivate and inspire their team, just as a conductor must be able to inspire the musicians to give their best performance. This may involve setting clear goals and expectations, providing constructive feedback, and recognizing and rewarding good performance.

Finally, a team leader must be able to bring out the best in their team, just as a conductor must be able to bring out the best in their musicians. This may involve identifying and addressing any issues or conflicts within the team, as well as fostering a positive and inclusive work environment.

The same principles of team leadership that apply to an orchestra also apply to businesses. Good business leaders have a clear vision for their company and are able to communicate that vision to their team. They are able to motivate and inspire their employees, and bring out the best in them. They are also able to adapt to changing circumstances and make adjustments as needed.

The impact of good business leaders cannot be overstated. Good business leaders have the ability to take a company to new heights, driving innovation and growth. They are able to create a positive and productive work environment, which can lead to increased employee satisfaction and retention. Good business leaders are also able to make sound decisions that drive the success of the company. One example of a good business leader is Jeff Bezos, the Founder of Amazon. Bezos has been able to take Amazon from a small online bookstore to a global e-commerce giant. He has a clear vision for the company and has consistently been able to adapt to changing circumstances and stay ahead of the competition. Bezos has also fostered a culture of innovation at Amazon, and has played a key role in driving the company’s growth.

On the other hand, bad business leaders can have a detrimental impact on a company. They may lack a clear vision or be unable to effectively communicate that vision to their team. They may also be inflexible and resistant to change, which can stifle innovation and growth. Bad business leaders may also create a negative work environment, leading to low employee morale and high turnover. Ultimately, bad business leaders can make poor decisions that lead to the failure of the company. An example of a bad business leader is Elizabeth Holmes, the former CEO of Theranos. Holmes had a vision for revolutionizing the healthcare industry with her company’s technology, but was unable to communicate that vision effectively to her team or to steer the team towards realistic goals. She was also resistant to change and unwilling to listen to the concerns of her team, leading to a number of problems within the company. Ultimately, Holmes’ poor leadership contributed to the failure of Theranos.

In conclusion, good business leaders are essential for the success of a company. They are able to drive innovation and growth, create a positive work environment, and make sound decisions. On the other hand, bad business leaders can have a detrimental impact on a company, leading to failure. The impact of good and bad business leaders can be seen in real-life examples throughout history and it is clear that strong leadership is crucial for the success of any business.


Leading software engineering teams

Not every leader has the opportunity to lead a large business, but opportunities to lead a software engineering team are easier to come by. Leading a small or medium-sized software engineering team nevertheless presents its own set of unique challenges and opportunities, and it is very different from leading other types of teams. Here, we explore some of the key characteristics of software engineering teams, how they differ from other types of teams, and some of the key challenges and considerations for their leaders.

One of the most significant differences between software engineering teams and other types of teams is the skill level and focus of team members. Software engineering team members are typically highly skilled and deeply knowledgeable about their work, with a focus on problem-solving and creativity. As a leader, it is important to recognize and respect the skills and expertise of your team members, and to create an environment that allows them to focus on these areas. This may involve setting clear goals and objectives, providing resources and support, and encouraging collaboration and innovation within the team.

Another key difference between software engineering teams and other types of teams is the organizational structure. Many software engineering teams operate with a flat organizational structure, with little or no hierarchy. This allows team members to collaborate and communicate more easily, and can be a key factor in the team’s success. As a leader, it is important to embrace this flat structure and to foster an open and collaborative team culture. This may involve facilitating communication, resolving conflicts, and building trust and collaboration within the team.

Software development projects are also typically very complex, with many different components and dependencies. As a leader, it is important to have a deep understanding of the project and to be able to effectively communicate this understanding to your team. You should also be able to anticipate and plan for potential challenges and roadblocks, and be able to adapt and pivot as needed.

Another major difference between software engineering teams and other types of teams is the rapid pace of technological change in the field. New technologies and approaches are constantly emerging, and it is important for leaders of software engineering teams to stay up to date with these developments and to be able to effectively incorporate them into their team’s work. This may require continuous learning and professional development efforts, such as attending conferences, taking online courses, and reading industry publications.

In addition to these technical challenges, leaders of software engineering teams also need to manage stakeholder expectations and team dynamics. Software development projects often have many stakeholders with different expectations and priorities, and leaders need to communicate clearly with them to ensure that the project stays on track and meets its goals. Managing team dynamics is also critical, as software engineering teams can be diverse, with team members from different backgrounds and with different personalities. As a leader, it is important to foster a positive and inclusive team culture.

Once the characteristics of software engineering teams and the key challenges for their leaders are understood, it becomes clear that leading a software engineering team is a unique and challenging responsibility, requiring a range of technical, communication, and management skills.


Leading data science teams

Leading a data science team is once again different from leading a generic software engineering team in several key ways. There are several challenges that leaders may face when transitioning from leading a software engineering team to a data science team. Some of the key challenges include:

  1. Different decision-making processes: Data science teams often rely on data-driven decision making, which can be different from the decision-making processes used in software engineering teams. One key aspect of data-driven decision making in a data science team is the ability to efficiently explore and analyze large amounts of data. In many cases, the data sets that data science teams work with can be quite large, and it can be challenging to effectively analyze and extract insights from them. Leaders may need to learn how to effectively use data to inform their decisions and communicate the results of their analyses to their teams and stakeholders.
  2. Different technical skills: Data science teams typically require a different set of technical skills than software engineering teams. While software engineering teams focus on building and maintaining software applications, data science teams focus on collecting, analyzing, interpreting, and modelling data. To efficiently explore and analyze large data sets, data science teams often use a variety of tools and techniques. One common approach is to use big data technologies, such as Hadoop or Spark, which are designed to handle large data sets and enable distributed processing. These technologies can help data science teams to quickly and efficiently process and analyze data, even when it is too large to fit on a single machine. In addition to big data technologies, data science teams may also use specialized data exploration and visualization tools. These tools can help team members quickly and easily visualize and understand data, and identify patterns and trends that may not be immediately apparent. As a result, leaders may need to gain new technical skills, such as data analysis and machine learning, in order to effectively lead their teams.
  3. Different team structures: Data science teams may have different structures and workflows than software engineering teams. For example, data science teams may be more decentralized, with team members working on a variety of different projects at the same time. This can be a challenge for leaders who are used to more traditional, hierarchical team structures.
  4. Different cultural norms: Data science teams may also have different cultural norms and expectations than software engineering teams. For example, data science teams may place a greater emphasis on collaboration and cross-functional teamwork, while software engineering teams may be more focused on individual contributions. Leaders may need to adapt to these different cultural norms in order to effectively lead their teams. Data science teams often work closely with other teams and departments within an organization, such as product management, marketing, and sales. As the leader of a data science team, you will need to be able to effectively collaborate with these other teams, and be able to bridge the gap between technical and non-technical domains.
  5. Different stakeholder expectations: Data science teams may have different stakeholders with different expectations than software engineering teams. For example, data science teams may be more closely aligned with business functions, and may need to work more closely with business leaders to understand their needs and goals. Leaders may need to adapt to these different stakeholder expectations. As the leader of a data science team, you will need to be able to analyze large sets of data and use statistical and machine learning techniques to extract insights and inform business decisions. This requires a strong understanding of data analysis and visualization tools, as well as an ability to communicate complex technical concepts to non-technical stakeholders.
  6. Different team dynamics: Data science teams may have different team dynamics and communication styles than software engineering teams. For example, data science teams may involve more collaboration and cross-functional teamwork, which can require different leadership approaches and communication styles. Leaders may need to adapt to these different team dynamics in order to effectively lead their teams.
  7. Different tools and technologies: Data science teams may use different tools and technologies than software engineering teams. For example, data science teams may rely more heavily on data analysis and visualization tools, while software engineering teams may use more traditional software development tools. Leaders may need to learn how to effectively use these different tools and technologies in order to lead their teams effectively.
  8. Managing data privacy and security: Data science teams often work with large volumes of sensitive data, and may need to ensure that this data is handled securely and in compliance with relevant regulations. Leaders may need to learn about data privacy and security best practices in order to effectively manage these risks.
  9. Managing team members with diverse backgrounds: Data science teams may have team members with diverse backgrounds and expertise, including computer science, statistics, mathematics, and domain-specific knowledge. Leaders may need to learn how to effectively manage and motivate team members with these diverse backgrounds.
  10. Staying up to date with the latest developments in the field: The field of data science is rapidly evolving, with new tools and technologies emerging constantly. Leaders may need to make a concerted effort to stay up to date with these developments in order to effectively lead their teams. This may require continuous learning and professional development efforts, such as attending conferences, taking online courses, and reading academic as well as industry publications. It is difficult to quantify exactly how fast data science is progressing, because it is a broad field that encompasses a wide range of technologies and techniques, but it is clear that the field is experiencing rapid growth and change. One way to measure the pace of change is to look at the growth in the number of job openings for data scientists. According to data from the U.S. Bureau of Labor Statistics, employment of data scientists is projected to grow by 11% between 2019 and 2029, which is much faster than the average for all occupations. This growth reflects the increasing demand for data-driven insights and decision-making in a wide range of industries. Another way to measure the pace of change is to look at the rate of development of new technologies and techniques. For example, machine learning, a key component of data science, has seen significant advances in the past decade in areas such as deep learning and reinforcement learning, which have enabled new applications and capabilities. Overall, it is important for data science teams to stay up to date and learn continuously in order to remain competitive.

To sum up, leading a data science team comes with its own set of challenges, including the need to stay up to date with rapid advancements in the field and to have a strong understanding of technical aspects of data science. Data science projects can also be complex and interdisciplinary, requiring effective organization and management. In my next blog post, I will discuss strategies for building and leading a successful data science team, including how to overcome these unique challenges.



GPTChat-aided Blogging

This post details how the earlier blog post was written.

My prompt

Here it goes!

Hi, can you help me writing blog post?

I was going through my previous posts, and I didn’t even remember I wrote them, and I forgot the books I’ve read. Revisiting my blog is like travelling back in time and rediscovering myself.

That’s a good start, can you add more depth to it?

Could you complete it with some examples?

Replace the “The Alchemist” example with the book “First Love, Last Rites”.

Add a few more example on writing skills.

Replace “The Great Gatsby” example with “The unconsciousness rules our lives”

No, stick with the previous version with “The great gatsby”

That’s not the correct version. Let’s continue with this one: Revisiting My Old Blog Posts: A Journey Through Time As I was going through my old blog posts, I was surprised to see how much I had forgotten about the books I had read and the things I had written about. It was like travelling back in time and rediscovering a part of myself that I had forgotten. For example, I came across a post where I had written about “First Love, Last Rites” by Ian McEwan. As I reread my thoughts on the book, I was struck by how much my interpretation of the story had changed since I first read it. When I first read the book, I was impressed by the vivid and poetic language used by the author. However, upon revisiting the book and my old blog post, I realized that the book had also touched upon deeper themes such as the power of first love and the fragility of life. Reading my old blog posts brought back memories of the time when I first read those books and shared my thoughts on them. It was interesting to see how my opinions and perspectives on certain topics had changed over the years. I realized that I had grown as a person, and my views on various subjects had evolved as I gained more knowledge and life experience. But revisiting my old blog posts was not just about reminiscing about the past. It was also an opportunity for me to see how far I had come as a reader and a writer. I could see the progress I had made and the things I had learned since I first started blogging. I was able to identify my strengths and weaknesses as a writer, and I gained a deeper understanding of my own writing style and voice. For example, I noticed that my writing had become more structured and focused over the years. In my earlier posts, I had a tendency to ramble and include tangential thoughts that distracted from the main message. However, as I reread my more recent posts, I could see that I had learned to be more concise and stick to the main points. 
In addition to personal growth, revisiting my old blog posts also reminded me of the value of consistency and dedication. I realized that I had been consistently reading and writing for a number of years, and that this dedication had played a significant role in my development as a reader and a writer. It reminded me that the journey of self-improvement is a long-term process, and that small steps taken consistently can lead to significant progress over time. Overall, revisiting my old blog posts was a meaningful and enriching experience. It reminded me of the importance of reflection and the value of looking back at where we have been in order to move forward. It also encouraged me to continue learning, growing, and writing in the future.

Attribute part of my growth or improvement to life experiences such as meeting different persons, traveling, and life changing events such as losing my mother to cancer.

please continue

Now add a few paragraphs to attribute part of my growth or improvement to experiences such as meeting different persons, traveling, and life changing events such as losing my mother to cancer.

continue

Add a new paragraph at the end with some action points, such as writing blog posts more frequently, and reading more books.

continue

Good enough, can you further polish this text to sound a little bit more sophisticated?

Great, am I allowed to publish this post on my website?

Can you make it longer with more examples?

continue

Rewrite “The Great Gatsby” paragraph with something from “Blink: the power of thinking without thinking”

continue


Migrating self-hosted WordPress

When I started this website in 2009, I was using Joomla. Then at some point I switched to WordPress because it was much easier to manage my posts. It was hosted at one.com, where I initially registered this domain name. However, the network and service weren’t really stable and I had a lot of hiccups back then. As a result, in 2017 I migrated the website to Scaleway.com, which offers very cheap virtual machines and small bare metal instances.

Another reason I switched was that I was also running a real-time crypto price crawling and algorithmic trading service with a few servers, and since my website didn’t have a ton of visitors, I could put it together with other services. Back then a small instance cost only 5 euros per month, and I found it to be more cost-effective than other cloud computing providers like AWS, GCP or Vultr. As time went by, I had more resource-intensive services running, such as staking pools which require large volumes. I ended up upgrading those tiny instances, and adding more resources from Scaleway. Later, when I phased out such services and reduced the number of instances, I couldn’t switch back to the smaller instances because either they didn’t exist any more or I couldn’t downgrade the volume size. So in the end I was paying a large monthly bill to run a small WordPress application.

And today I finally managed to spare a few hours to review the services I need, and decided to migrate the website back to one.com, which I was already paying for anyway. As a side note, the price of the one.com subscription also increased by 4x or 5x in 10 years.

Migrating WordPress alone is easy: back up all the files and the database, then copy the files to one.com, import the database, and set up the DNS on Cloudflare. I asked ChatGPT to draw me this flowchart:

During the process, I also tried setting up a local LAMP+WP service with docker-compose (this repo is very useful: https://github.com/nezhar/wordpress-docker-compose) and tried exporting my posts into GoHugo format. In the end I decided it was too much hassle and not worth it.

A few issues I encountered during the migration process:

  1. .htaccess files should be set up properly, otherwise WP complains about 404 for sub pages.
  2. Some WP plugins were outdated (code highlighting, for example), which caused some page rendering issues. After I replaced the plugin, the issues were gone.
  3. File permissions have to be 744 or 755 for the wp-content/uploads folder.
  4. I used nslookup daoyuan.li ns01.one.com to get the IP of one.com’s server, and manually updated it on Cloudflare, since I prefer to use Cloudflare to manage my DNS. In the future I may need to automate this process, in case one.com moves my VM to another server. Or at least I should monitor the output from ns01.one.com versus kara.ns.cloudflare.com.
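The monitoring idea from the last item could be sketched roughly as follows. This is only a minimal Python sketch under a few assumptions: the nslookup CLI is on the PATH, its output lists the answers after a "Name:" line (as on Linux and macOS), and only IPv4 addresses matter. The function names are made up for illustration. Note also that if the record is proxied through Cloudflare rather than DNS-only, the two answers will differ by design.

```python
import re
import subprocess

# Matches answer lines such as "Address: 203.0.113.10" (IPv4 only, for brevity).
ADDRESS_RE = re.compile(r"^Address(?:es)?:\s*(\d{1,3}(?:\.\d{1,3}){3})\s*$")


def parse_answer_ips(nslookup_output):
    """Return the set of IPv4 addresses listed after the first 'Name:' line.

    The 'Address:' line that appears before 'Name:' belongs to the queried
    name server itself, so it is skipped.
    """
    ips = set()
    in_answer = False
    for line in nslookup_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Name:"):
            in_answer = True
            continue
        match = ADDRESS_RE.match(stripped)
        if in_answer and match:
            ips.add(match.group(1))
    return ips


def lookup(domain, name_server):
    """Ask one specific name server for a domain via the nslookup CLI."""
    result = subprocess.run(
        ["nslookup", domain, name_server],
        capture_output=True, text=True, check=False, timeout=30,
    )
    return parse_answer_ips(result.stdout)


def in_sync(origin_ips, public_ips):
    """True only when the origin returned something and both sets match."""
    return bool(origin_ips) and origin_ips == public_ips


# Example usage (performs real DNS queries, so it is left commented out):
# origin = lookup("daoyuan.li", "ns01.one.com")
# public = lookup("daoyuan.li", "kara.ns.cloudflare.com")
# if not in_sync(origin, public):
#     print("DNS mismatch:", origin, "vs", public)
```

Run from a cron job, a check like this would at least flag the day one.com moves the site to a different IP.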

Besides migrating WP, I also had a few other websites to migrate. But since they are all static websites, I used Cloudflare Pages to host them and assigned the domains to the corresponding Pages project.

I probably spent 3 to 4 hours migrating everything, and double checking everything works fine. Then I went ahead and terminated all instances, elastic IPs and volumes on Scaleway. That would save me a couple of hundred euros a year. Not bad ROI!


Notes to myself: RTX 2070, Cuda, cudnn, caffe, and faceswap

Install NVIDIA driver for RTX 2070: https://www.geforce.com/drivers

Install CUDA 10.0: https://developer.nvidia.com/cuda-downloads

DO NOT re-install the drivers suggested by the CUDA installer:
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: n

CuDNN:

cuDNN Runtime Library for Ubuntu18.04 (Deb)

cuDNN Developer Library for Ubuntu18.04 (Deb)

Caffe:

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_cublas_device_LIBRARY (ADVANCED)
linked by target “caffe” in directory /home/dli/Projects/caffe/src/caffe

Upgrade cmake:

face_swap


Configure Selenium and Chrome to use Tor proxy

I’ve been trying to configure Selenium and Chrome to use Tor as a proxy and kept getting error messages like the following:

WebDriverException: Message: invalid argument: cannot parse capability: proxy
from invalid argument: Specifying ‘socksProxy’ requires an integer for ‘socksVersion’
(Driver info: chromedriver=2.44.609545 (c2f88692e98ce7233d2df7c724465ecacfe74df5),platform=Mac OS X 10.14.0 x86_64)

In the end I had to use an HTTP proxy instead of SOCKS.

Install and start Tor:

Install privoxy:

Configure privoxy (vi /usr/local/etc/privoxy/config) to chain it with Tor:

Start privoxy by default on port 8118:

Check if your traffic is proxied:

Now you should see a different IP than your real one.
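On the Selenium side, pointing Chrome at the HTTP proxy could look roughly like the sketch below. It assumes privoxy is listening on 127.0.0.1:8118 as described above and that selenium and a matching chromedriver are installed; the helper names are made up for illustration and this is not necessarily the exact code I used.

```python
def proxy_argument(host="127.0.0.1", port=8118):
    """Build Chrome's --proxy-server flag for an HTTP proxy such as privoxy."""
    return "--proxy-server=http://{}:{}".format(host, port)


def make_driver(host="127.0.0.1", port=8118):
    """Start Chrome with all traffic sent through the HTTP proxy."""
    # Imported lazily so proxy_argument() works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(host, port))
    return webdriver.Chrome(options=options)


# Example usage (needs Chrome, chromedriver, Tor and privoxy running):
# driver = make_driver()
# driver.get("https://check.torproject.org/")
# print(driver.title)
# driver.quit()
```

Passing the proxy as a Chrome command-line flag sidesteps the proxy capability that chromedriver failed to parse in the error above.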


Docker Postgres “PANIC: could not locate a valid checkpoint record”

It seemed that my Postgres database was not properly shut down when rebooting and when I tried to use docker-compose to start it again, the following message was shown in docker logs:

To fix this, first shut down this container (docker-compose down), then start the container in interactive mode:

After the transaction log is reset, everything should be fine. Now you can start your containers again (docker-compose up -d).


Be careful with market orders

I was testing my algorithmic trading program just now, and ran into a very important issue with market orders.

TL;DR DO NOT USE MARKET ORDERS UNLESS ABSOLUTELY CONFIDENT!!!

Market orders ensure immediate execution, but without any guarantee of the execution price. As a result, your order may be executed at a much higher price than you expected, especially when the trading volume is low and the spread is large. In my case, I ended up paying 5% more than the price I was willing to pay…
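To make the arithmetic concrete, here is a small Python sketch of how a market buy walks up a thin order book. The order book numbers are invented for illustration and are not my actual trade; the function names are hypothetical as well.

```python
def market_buy_average_price(asks, quantity):
    """Average fill price of a market buy that walks the ask side of the book.

    asks: list of (price, size) tuples sorted from lowest to highest price.
    """
    remaining = quantity
    cost = 0.0
    for price, size in asks:
        filled = min(remaining, size)
        cost += filled * price
        remaining -= filled
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("not enough liquidity to fill the order")
    return cost / quantity


def slippage_pct(average_price, reference_price):
    """How much more (in percent) was paid compared with the reference price."""
    return (average_price - reference_price) * 100.0 / reference_price


def limit_buy_fillable(asks, quantity, limit_price):
    """Quantity a limit buy could fill right away without exceeding limit_price."""
    fillable = 0.0
    for price, size in asks:
        if price > limit_price:
            break
        fillable += size
    return min(fillable, quantity)


# Hypothetical thin order book: little size at the best ask, then big gaps.
asks = [(100.0, 1.0), (104.0, 1.0), (111.0, 2.0)]
average = market_buy_average_price(asks, 3.0)   # (100 + 104 + 111) / 3
print(average, slippage_pct(average, 100.0))
print(limit_buy_fillable(asks, 3.0, 100.0))     # only the first level fills
```

With this book, a market buy of 3 units pays about 5% more than the best ask, while a limit order at the best ask fills only 1 unit and leaves the rest waiting at a price I actually agreed to.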

Lesson learned.


Exporting and Importing Elasticsearch Indices

In my project I needed to run some local tests with data from a production elasticsearch cluster, so I exported data from the production server and imported it into my local cluster. The same approach can also be used for backing up and restoring data. Here are the instructions.

Before you start, check out the official documentation: Snapshot and Restore.

Backing up/exporting data:

  1. Modify your elasticsearch configuration file (normally elasticsearch.yml) and add a path.repo line, for example:
  2. Make sure this path has the correct permissions so that elasticsearch can read and write.
  3. Create snapshot:
  4. Copy the files in the configured location to your local machine.
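The snapshot step above boils down to two REST calls against the snapshot API. The Python sketch below only illustrates the shape of those calls; the repository name, snapshot name, host and location are placeholders I made up, and the location has to sit under the configured path.repo.

```python
import json
import urllib.request


def snapshot_requests(base_url, repo, snapshot, location):
    """Return the (method, url, body) calls that register a repo and snapshot it."""
    return [
        # Register a shared file system repository under the path.repo location.
        ("PUT", f"{base_url}/_snapshot/{repo}",
         {"type": "fs", "settings": {"location": location}}),
        # Snapshot all indices and wait until the snapshot has finished.
        ("PUT",
         f"{base_url}/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         None),
    ]


def send(method, url, body=None):
    """Send one JSON request to elasticsearch and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (talks to a real cluster, so it is left commented out):
# for method, url, body in snapshot_requests(
#         "http://localhost:9200", "my_backup", "snapshot_1",
#         "/mount/backups/my_backup"):
#     print(send(method, url, body))
```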

Restoring/importing data:

  1. Modify your local elasticsearch configuration in the same way as in step 1 of the backup.
  2. Place the snapshot files in the repo path.
  3. Close your indices:
  4. Import data:
  5. Reopen your indices:
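The close, restore and reopen steps map onto three REST calls. Here is a minimal Python sketch that only builds them; the repository, snapshot and index names are placeholders, and it assumes the repository has already been registered on the local cluster.

```python
def restore_requests(base_url, repo, snapshot, indices):
    """Return the (method, url, body) calls for close, restore and reopen."""
    index_list = ",".join(indices)
    return [
        # An existing index has to be closed before it can be restored.
        ("POST", f"{base_url}/{index_list}/_close", None),
        # Restore only the listed indices and wait for the restore to finish.
        ("POST",
         f"{base_url}/_snapshot/{repo}/{snapshot}/_restore"
         "?wait_for_completion=true",
         {"indices": index_list}),
        # Reopen the indices so they can be searched again.
        ("POST", f"{base_url}/{index_list}/_open", None),
    ]


# Print the calls for a hypothetical index; send them with curl or any HTTP client.
for method, url, body in restore_requests(
        "http://localhost:9200", "my_backup", "snapshot_1", ["my_index"]):
    print(method, url, body)
```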

It is important that the elasticsearch version on the importing side is compatible with the one that exported the data, i.e., in this case your local machine has to run the same version or a newer one. If not, you need to upgrade elasticsearch first. The official documentation says:

The information stored in a snapshot is not tied to a particular cluster or a cluster name. Therefore it’s possible to restore a snapshot made from one cluster into another cluster. All that is required is registering the repository containing the snapshot in the new cluster and starting the restore process. The new cluster doesn’t have to have the same size or topology. However, the version of the new cluster should be the same or newer than the cluster that was used to create the snapshot.


Installing Theano and CUDA on Mac OS X

I started trying Theano today and wanted to use the GPU (NVIDIA GeForce GT 750M 2048 MB) on my Mac. Here are brief instructions on how to use the GPU on Mac, largely following the instructions from http://deeplearning.net/software/theano/install.html#mac-os.

Install Theano:

Download and install CUDA: https://developer.nvidia.com/cuda-downloads

Put the following lines into your ~/.bash_profile:

Note that the PATH line is necessary. Otherwise you may see the following message:

ERROR (theano.sandbox.cuda): nvcc compiler not found on $PATH. Check your nvcc installation and try again.

Configure Theano:

Test if GPU is used:

A more realistic example: